PROBLEM 1

Select frequent words (those whose count is greater than or equal to 50,000).
Display the frequent words in descending order. 
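The filter-then-sort logic can be sketched in plain Python. This is only an illustration under stated assumptions: the input is assumed to be a list of (word, count) pairs, "descending order" is assumed to mean descending by count, and the sample data, `THRESHOLD`, and `frequent_words` are hypothetical names that do not come from the assignment.

```python
# Hypothetical sample data: (word, count) pairs.
# The real input format is not specified in the assignment.
word_counts = [
    ("the", 120000),
    ("of", 75000),
    ("pig", 50000),
    ("latin", 49999),
    ("zebra", 12),
]

THRESHOLD = 50_000  # "equal to or greater than 50,000"


def frequent_words(pairs, threshold=THRESHOLD):
    """Keep pairs whose count is >= threshold, sorted by count, largest first."""
    kept = [(word, count) for word, count in pairs if count >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


frequent = frequent_words(word_counts)
for word, count in frequent:
    print(word, count)
```

Note that the comparison is inclusive, so a word with a count of exactly 50,000 is kept.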
Group the words by their length (Hint: use the built-in function SIZE) and count the occurrences in each group.
For example,
(2,1096049) means that there are 1096049 occurrences of words that have two characters.
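A rough Python sketch of the grouping step follows, with `len` standing in for the SIZE function mentioned in the hint. It assumes the input is a list of (word, count) pairs and that "count each group" means summing the occurrence counts of all words of the same length, which matches the (2,1096049) example. The sample data and the name `occurrences_by_length` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (word, count) pairs.
word_counts = [
    ("of", 700),
    ("to", 300),
    ("the", 1200),
    ("and", 800),
    ("word", 50),
]


def occurrences_by_length(pairs):
    """Sum the occurrence counts of all words that share the same length."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[len(word)] += count  # len() plays the role of SIZE here
    # Return (length, total) tuples ordered by length.
    return sorted(totals.items())


length_groups = occurrences_by_length(word_counts)
for length, total in length_groups:
    print((length, total))
```

If the input were instead one row per word occurrence, each row would contribute 1 to its group rather than a precomputed count.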
Problem 3 is based on the dataset nyc_taxi_data_2014.csv.gz.
Find the effect of passenger_count on trip_distance, fare_amount, and tip_rate.
a) Create a new data set records2 that has passenger_count, trip_distance, fare_amount, and 
tip_rate (tip_amount/total_amount).
b) Filter records2 by passenger_count (0 < passenger_count < 10) and name the data set as records3.
c) Group records3 by passenger_count. 
d) Display the average trip_distance, average fare_amount, and average tip_rate for each 
passenger_count group.
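Steps a) through d) can be sketched in plain Python as follows. This is a minimal illustration, not the expected solution: the rows are hypothetical stand-ins for the CSV, the function name `average_by_passenger_count` is invented, and treating tip_rate as 0.0 when total_amount is zero is an assumption the assignment does not state.

```python
from collections import defaultdict

# Hypothetical rows standing in for nyc_taxi_data_2014.csv.gz.
records = [
    {"passenger_count": 1, "trip_distance": 2.0, "fare_amount": 10.0,
     "tip_amount": 2.0, "total_amount": 12.0},
    {"passenger_count": 1, "trip_distance": 4.0, "fare_amount": 14.0,
     "tip_amount": 0.0, "total_amount": 14.0},
    {"passenger_count": 2, "trip_distance": 3.0, "fare_amount": 12.0,
     "tip_amount": 3.0, "total_amount": 15.0},
    {"passenger_count": 0, "trip_distance": 1.0, "fare_amount": 5.0,
     "tip_amount": 1.0, "total_amount": 6.0},   # removed in step b)
    {"passenger_count": 12, "trip_distance": 9.0, "fare_amount": 30.0,
     "tip_amount": 5.0, "total_amount": 35.0},  # removed in step b)
]


def average_by_passenger_count(rows):
    # a) Project the needed fields and derive tip_rate.
    #    Assumption: tip_rate is 0.0 when total_amount is zero.
    records2 = [
        (r["passenger_count"], r["trip_distance"], r["fare_amount"],
         r["tip_amount"] / r["total_amount"] if r["total_amount"] else 0.0)
        for r in rows
    ]
    # b) Keep rows with 0 < passenger_count < 10.
    records3 = [r for r in records2 if 0 < r[0] < 10]
    # c) Group by passenger_count.
    groups = defaultdict(list)
    for passengers, distance, fare, tip_rate in records3:
        groups[passengers].append((distance, fare, tip_rate))
    # d) Average each column within each group.
    result = {}
    for passengers, values in sorted(groups.items()):
        n = len(values)
        result[passengers] = tuple(sum(col) / n for col in zip(*values))
    return result


averages = average_by_passenger_count(records)
for passengers, (avg_dist, avg_fare, avg_tip_rate) in averages.items():
    print(passengers, avg_dist, avg_fare, avg_tip_rate)
```

Both bounds of the filter are exclusive, so rows with a passenger_count of 0 or of 10 and above are dropped before grouping.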

