Friday, March 31, 2017

Code used for the Grab Challenge 2017

Yay! We did not win!

Let me share the scripts from our submission to the DataSeer Grab Challenge 2017.





First, let us import the needed libraries.


Then we processed the data set using pandas. First, we opened the file with read_csv. We then used the "created_at_local" column to derive the "date", "time" and "day of week" columns with the datetime library.
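Here is a minimal sketch of how this step could look; it is not the exact submission script. Only "created_at_local" is a column name from the data set. The sample rows, the "city" and "state" columns, the "day_of_week" column name and the timestamp format are assumptions made for illustration.

```python
import io
from datetime import datetime

import pandas as pd

# Tiny stand-in for the challenge CSV (hypothetical rows).
sample_csv = io.StringIO(
    "created_at_local,city,state\n"
    "2013-09-06 17:45:10,Cebu,COMPLETED\n"
    "2013-09-07 08:05:00,Davao,CANCELLED\n"
)

df = pd.read_csv(sample_csv)

# Assumed timestamp layout; adjust the format string to match the real file.
parsed = df["created_at_local"].apply(
    lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
)

# Derive the three helper columns from the parsed timestamp.
df["date"] = parsed.apply(lambda d: d.date())
df["time"] = parsed.apply(lambda d: d.time())
df["day_of_week"] = parsed.apply(lambda d: d.strftime("%A"))

print(df[["date", "time", "day_of_week"]])
```

On the real file, df.shape is a quick way to confirm the row and column counts after adding the new columns.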

Here is a sample of the extracted dataframe. The dataframe has 265073 rows and 13 columns.


The "city" and "city_only" dataframes were created. We'll keep "city" for later (see Fig 3).

Findings on Pickup Distance:
Fig 1: This is a summary of 2013 trips showing the average pickup distance (in km) per city. The data suggests that a trip tends to get cancelled when the average pickup point is about 1.8 km away or more, and tends to be completed when it is below the 1.8 km mark.
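A summary like the one in Fig 1 can be sketched with a groupby. The column names "state" and "pick_up_distance", the state labels and the numbers below are all hypothetical; the real data set may use different names.

```python
import pandas as pd

# Hypothetical sample: pickup distance in km per trip.
trips = pd.DataFrame({
    "city": ["Cebu", "Cebu", "Cebu", "Davao", "Davao"],
    "state": ["COMPLETED", "COMPLETED", "CANCELLED", "COMPLETED", "CANCELLED"],
    "pick_up_distance": [1.0, 1.5, 2.0, 0.5, 2.5],
})

# Average pickup distance per city, one column per trip state.
avg_distance = (
    trips.groupby(["city", "state"])["pick_up_distance"]
    .mean()
    .unstack("state")
)

print(avg_distance)
```

Calling avg_distance.plot(kind="bar") on a table like this would give a grouped bar chart per city, assuming matplotlib is installed.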


Fig 1: ax1
Findings on AAR:
Fig 2: This graph shows the Actual Allocation Rate (AAR) per city for 2013. AAR is below 50% for all cities.
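As a rough sketch, AAR per city can be computed as the share of requests that were allocated. This assumes a "state" column in which unallocated requests are labelled "UNALLOCATED" and every other state counts as allocated; both the label and that definition are assumptions, and the rows are made up.

```python
import pandas as pd

# Hypothetical sample of trip requests.
trips = pd.DataFrame({
    "city": ["Cebu", "Cebu", "Cebu", "Cebu", "Davao", "Davao"],
    "state": ["COMPLETED", "UNALLOCATED", "UNALLOCATED", "UNALLOCATED",
              "COMPLETED", "UNALLOCATED"],
})

# Assumption: a request is allocated unless its state is "UNALLOCATED".
trips["allocated"] = (trips["state"] != "UNALLOCATED").astype(int)

# Mean of a 0/1 flag is the allocation rate; scale to percent.
aar = trips.groupby("city")["allocated"].mean() * 100

print(aar)
```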



Fig 2: ax2
Recommendation on AAR:
Fig 3: When the AAR data is broken down by source, VNU in Metro Manila has the highest AAR, at above 50%. We used the "city" dataframe here.
*It might be a good idea to promote VNU in other cities.
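The breakdown by source can be sketched by grouping on two keys and picking the pair with the highest rate. The "source" column name, the "APP" label, the "UNALLOCATED" label and the rows are hypothetical; only "VNU" comes from the data discussed here.

```python
import pandas as pd

# Hypothetical sample: VNU only appears in Metro Manila.
trips = pd.DataFrame({
    "city": ["Metro Manila"] * 8 + ["Cebu"] * 4,
    "source": ["VNU"] * 4 + ["APP"] * 4 + ["APP"] * 4,
    "state": [
        "COMPLETED", "COMPLETED", "COMPLETED", "UNALLOCATED",      # MM / VNU
        "COMPLETED", "COMPLETED", "UNALLOCATED", "UNALLOCATED",    # MM / APP
        "COMPLETED", "UNALLOCATED", "UNALLOCATED", "UNALLOCATED",  # Cebu / APP
    ],
})

# Assumption: a request is allocated unless its state is "UNALLOCATED".
trips["allocated"] = (trips["state"] != "UNALLOCATED").astype(int)

# Allocation rate in percent for every (city, source) pair.
city_source_aar = trips.groupby(["city", "source"])["allocated"].mean() * 100

# The (city, source) pair with the highest rate.
best_pair = city_source_aar.idxmax()

print(city_source_aar)
print(best_pair)
```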

Fig 3: ax3
Findings on the Relationship of Trip State with Time:
Fig 4: This graph shows the number of trips per status per hour in Cebu. Cebu had a high unallocated rate from 5pm to 7pm.


A new dataframe called "time" is created. This df uses time (in hours) instead of dates.
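One way to build an hour-based table like this is a crosstab of hour against trip status. The name time_df is used here instead of "time" only to avoid clashing with Python's time module; the column names (other than "created_at_local"), the state labels and the rows are assumptions.

```python
import pandas as pd

# Hypothetical sample of trip requests.
trips = pd.DataFrame({
    "city": ["Cebu", "Cebu", "Cebu", "Cebu", "Davao"],
    "created_at_local": [
        "2013-09-06 17:05:00",
        "2013-09-06 17:40:00",
        "2013-09-06 18:15:00",
        "2013-09-07 08:30:00",
        "2013-09-07 17:20:00",
    ],
    "state": ["UNALLOCATED", "UNALLOCATED", "COMPLETED", "COMPLETED",
              "UNALLOCATED"],
})

# Keep only the hour of day (0-23) from each timestamp.
trips["hour"] = pd.to_datetime(trips["created_at_local"]).dt.hour

# Count trips per hour and status for a single city.
cebu = trips[trips["city"] == "Cebu"]
time_df = pd.crosstab(cebu["hour"], cebu["state"])

print(time_df)
```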



Fig 4: ax4


Fig 5: This graph shows the number of trips per status per hour in Davao. Davao had a high unallocated rate at around 5pm, which started to go down by 8pm.


Fig 5: ax5
Fig 6: This graph shows the number of trips per status per hour in Metro Manila. Metro Manila had an initial spike in unallocated trips that started at the early commute hour of 5am and dropped at around 10am. A second spike started to rise at 3pm and went down by 9pm, with the highest peak at around 6pm.


Fig 6: ax6


Fig 7: Cebu had a high travel count on Mondays, Fridays and Saturdays. The daily dataframe shows the trip status and fare per day, sorted per city. It also shows the daily pickup distance and the corresponding trip status.
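A per-day summary along these lines can be sketched with a named aggregation. The "state" and "fare" columns, the output column names and the rows are hypothetical, and this sketch only covers trip count and fare, not pickup distance.

```python
import pandas as pd

# Hypothetical sample of trip requests.
trips = pd.DataFrame({
    "city": ["Cebu", "Cebu", "Cebu", "Cebu", "Davao"],
    "created_at_local": [
        "2013-09-02 08:00:00",  # Monday
        "2013-09-02 18:30:00",  # Monday
        "2013-09-06 17:10:00",  # Friday
        "2013-09-03 09:00:00",  # Tuesday
        "2013-09-05 12:00:00",  # Thursday
    ],
    "state": ["COMPLETED", "UNALLOCATED", "COMPLETED", "CANCELLED",
              "COMPLETED"],
    "fare": [120.0, 0.0, 95.0, 0.0, 80.0],
})

trips["day_of_week"] = pd.to_datetime(trips["created_at_local"]).dt.day_name()

# Trip count and total fare per city and day of week.
daily = (
    trips.groupby(["city", "day_of_week"])
    .agg(trip_count=("state", "size"), total_fare=("fare", "sum"))
    .reset_index()
)

# One city, with the days in calendar order and missing days filled with 0.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
cebu_daily = (
    daily[daily["city"] == "Cebu"]
    .set_index("day_of_week")[["trip_count", "total_fare"]]
    .reindex(day_order, fill_value=0)
)

print(cebu_daily)
```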

Fig 7: ax7


Fig 8: Davao's travel count peaked every Thursday.


Fig 8: ax8


Fig 9: Metro Manila's peak is Friday.

Fig 9: ax9



We hope you have picked up a thing or two... Your comments are very welcome!
...pardon the html boxes...lol
