Sunday, February 26, 2017

The DataSeer Grab Challenge 2017


The Dataset has been generously provided by Grab Philippines:
  • The dimensions of the dataset are (265073 rows, 10 columns).
  • The period covered is 2013 only. The only available taxi type is GrabTaxi (neither GrabCar nor any other taxi type was offered here back then).
  • Cities covered are Metro Manila, Cebu, and Davao.
  • Source is the channel through which the booking was made. ADR is for Android smartphones, IOS is for iPhones, VNU is for partner venues (like events or hotels, for example), T47 and COM are bookings manually created from computers (for those who called in back then or were manually assigned by callouts), WIN is for Windows smartphones, and BBA is for BlackBerry.
  • There are two general states for a booking: allocated and unallocated. Allocated means the booking was paired with a driver. Allocated bookings are further broken down into cancelled bookings and completed rides.
  • One of our measures of success is the allocation rate (AR), or the rate of successful matching we can do. Another is the actual allocation rate (AAR), or the rate of completed matches that actually happened. The formulas are:
    • AR = (Allocated) / (Allocated + Unallocated)
    • AAR = (Completed) / (Allocated + Unallocated)
  • Fares are in PHP. As this is GrabTaxi, the fare is simply meter fare + booking fee (BF). In 2013, the BF was only PHP 40.
  • pick_up_distance is the distance of the driver from the passenger. This is measured by road distance and not by straight line. The optimal distance back in 2013, due to the large mismatch between supply and demand, was 3 kilometers. Anything beyond that is considered an outlier and/or a very bad match (i.e. it increases the chance that the booking will be cancelled due to ETA, etc.).
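
To make the AR and AAR formulas above concrete, here is a minimal sketch in plain Python. It assumes the state column uses the labels COMPLETED, CANCELLED, and UNALLOCATED; only the first two appear in the sample shown in this post, so the UNALLOCATED label is an assumption. The function name and the toy list of states are made up for illustration.

```python
from collections import Counter

def allocation_rates(states):
    """Return (AR, AAR) for an iterable of booking states."""
    counts = Counter(states)
    completed = counts["COMPLETED"]
    cancelled = counts["CANCELLED"]
    unallocated = counts["UNALLOCATED"]  # assumed label, not shown in the sample
    allocated = completed + cancelled    # allocated = cancelled + completed
    total = allocated + unallocated
    if total == 0:
        raise ValueError("no bookings to compute rates from")
    return allocated / total, completed / total

# Toy data, not taken from the dataset.
sample_states = ["COMPLETED", "CANCELLED", "UNALLOCATED", "COMPLETED", "UNALLOCATED"]
ar, aar = allocation_rates(sample_states)
print(f"AR = {ar:.2f}, AAR = {aar:.2f}")
```

Because completed rides are a subset of allocated bookings, AAR can never be higher than AR.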


DataFrame Sample:
In[1]:df1.head()
Out[1]:
  source    created_at_local  pick_up_latitude  pick_up_longitude  \
0    ADR 2013-09-22 23:46:18         14.604348         120.998654  
1    T47 2013-11-04 03:51:59         14.590099         121.082645  
2    T47 2013-11-21 05:21:24         14.582707         121.061458  
3    ADR 2013-09-16 20:53:34         14.585812         121.060171  
4    IOS 2013-09-10 23:49:16         14.552010         121.051260  

   drop_off_latitude  drop_off_longitude          city     fare  \
0          14.537370          120.994423  Metro Manila  281.875  
1          14.508611          121.019444  Metro Manila  413.125  
2          14.537752          121.001379  Metro Manila  277.500  
3          14.575915          121.085487  Metro Manila  220.625  
4          14.630210          120.995920  Metro Manila  378.125  

   pick_up_distance      state        date  time day of week 
0          0.389894  CANCELLED  2013-09-22    23      Sunday 
1          2.209770  COMPLETED  2013-11-04     3      Monday 
2          2.702910  COMPLETED  2013-11-21     5    Thursday 
3          0.321403  CANCELLED  2013-09-16    20      Monday 
4          0.667067  COMPLETED  2013-09-10    23     Tuesday


Fig. 1

Findings on Pickup Distance:
Fig 1: This is a summary of 2013 trips showing the average pickup distance (in km) per city. The data suggests that cancelled trips have an average pickup distance of about 1.8 km, while completed trips average below the 1.8 km mark.
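
A summary like the one in Fig. 1 can be sketched as a mean of pick_up_distance grouped by city and state. The snippet below is only an illustration in plain Python, not the code used for the figure. The helper name is hypothetical, and the input is the five rows from the DataFrame sample above, which are far too few to reproduce the figure's averages.

```python
from collections import defaultdict

def mean_pickup_distance(rows):
    """Average pick_up_distance (km) for each (city, state) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of distances, count]
    for row in rows:
        key = (row["city"], row["state"])
        totals[key][0] += row["pick_up_distance"]
        totals[key][1] += 1
    return {key: dist_sum / n for key, (dist_sum, n) in totals.items()}

# The five rows shown in the DataFrame sample.
sample_rows = [
    {"city": "Metro Manila", "state": "CANCELLED", "pick_up_distance": 0.389894},
    {"city": "Metro Manila", "state": "COMPLETED", "pick_up_distance": 2.209770},
    {"city": "Metro Manila", "state": "COMPLETED", "pick_up_distance": 2.702910},
    {"city": "Metro Manila", "state": "CANCELLED", "pick_up_distance": 0.321403},
    {"city": "Metro Manila", "state": "COMPLETED", "pick_up_distance": 0.667067},
]

for (city, state), km in sorted(mean_pickup_distance(sample_rows).items()):
    print(f"{city:<14} {state:<10} {km:.3f} km")
```

With pandas, the same grouping on the full dataset would be `df1.groupby(["city", "state"])["pick_up_distance"].mean()`.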

Fig. 2

Fig. 3

Findings on AAR:
Fig 2: This graph shows the Actual Allocation Rate (AAR) per city for 2013. AAR is below 50% for all cities.

Recommendation on AAR:
Fig 3: When the AAR is broken down by source, VNU in Metro Manila has the highest AAR, which is above 50%.
*It might be a good idea to promote VNU in other cities.
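
A breakdown like the one in Fig. 3 amounts to computing AAR separately for each (city, source) pair. The following is a rough sketch under the same assumption about state labels (COMPLETED, CANCELLED, UNALLOCATED). The function name is hypothetical and the bookings are invented toy records, so the printed rates say nothing about the real dataset.

```python
from collections import defaultdict

def aar_by_group(bookings):
    """AAR = completed / all bookings, for each (city, source) pair."""
    completed = defaultdict(int)
    total = defaultdict(int)
    for city, source, state in bookings:
        key = (city, source)
        total[key] += 1
        if state == "COMPLETED":
            completed[key] += 1
    return {key: completed[key] / total[key] for key in total}

# Invented toy records: (city, source, state).
toy_bookings = [
    ("Metro Manila", "VNU", "COMPLETED"),
    ("Metro Manila", "VNU", "COMPLETED"),
    ("Metro Manila", "VNU", "UNALLOCATED"),
    ("Metro Manila", "ADR", "COMPLETED"),
    ("Metro Manila", "ADR", "CANCELLED"),
    ("Metro Manila", "ADR", "UNALLOCATED"),
    ("Cebu", "ADR", "UNALLOCATED"),
    ("Cebu", "ADR", "COMPLETED"),
]

# List the groups from highest to lowest AAR.
for (city, source), rate in sorted(aar_by_group(toy_bookings).items(),
                                   key=lambda item: item[1], reverse=True):
    print(f"{city:<14} {source}  AAR = {rate:.0%}")
```

On real data it is worth checking the number of bookings behind each group as well, since a source with very few bookings can show a high AAR by chance.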

Fig. 4

Findings on the Relationship of Trip State with Time:
Fig 4: This graph shows the number of trips in each state per hour in Cebu. Cebu had a high unallocated rate from 5 pm to 7 pm.

Fig. 5

Fig. 6

Fig 5: This graph shows the number of trips in each state per hour in Davao. Davao had a high unallocated rate at around 5 pm, which started to go down by 8 pm.

Fig 6: This graph shows the number of trips in each state per hour in Metro Manila. Unallocated trips first spiked starting at the early commute hour of 5 am and dropped at around 10 am. A second spike started to rise at 3 pm, peaked at around 6 pm, and went down by 9 pm.
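
The hourly views in Figs. 4 to 6 come down to counting bookings per hour for each state and then looking for the busiest hours. Below is a small sketch of that counting step in plain Python. It assumes the hour is available as an integer from 0 to 23 (as in the time column of the sample) and that unallocated bookings are labelled UNALLOCATED. Both function names and the toy rows are made up for illustration.

```python
from collections import Counter

def trips_per_hour(rows, state):
    """Count bookings in the given state for each hour of the day (0..23)."""
    counts = Counter(hour for hour, row_state in rows if row_state == state)
    return [counts.get(hour, 0) for hour in range(24)]

def peak_hour(hourly_counts):
    """Hour with the most bookings; the earliest hour wins on ties."""
    return max(range(24), key=lambda hour: hourly_counts[hour])

# Invented toy rows: (hour of day, state).
toy_rows = [
    (8, "COMPLETED"),
    (17, "UNALLOCATED"),
    (18, "UNALLOCATED"),
    (18, "UNALLOCATED"),
    (18, "COMPLETED"),
    (19, "UNALLOCATED"),
]

unallocated = trips_per_hour(toy_rows, "UNALLOCATED")
print("Peak hour for unallocated bookings:", peak_hour(unallocated))
```

Dividing each hourly unallocated count by the total bookings in that hour would give an unallocated rate per hour instead of a raw count.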

Recommendations to address unallocated trips:
*Most of the spikes happen during rush hour. It might be a good idea to have carpool promos or cars with more than 4 passenger capacity.
*Point to point travel might be considered as well.

Findings on the Trip Status per Day:
Fig. 7
Figs 7 to 9 show the number of trips in each state per day of the week for the different cities.

Fig 7: Cebu had high trip counts on Mondays, Fridays, and Saturdays.

Fig 8: Davao's trip count peaked on Thursdays.

Fig 9: Metro Manila's peak is Friday.


Recommendation:
*As with the recommendation above, carpool promos and larger-capacity vehicles should be concentrated on the peak days of the week.



Fig. 8

Fig. 9

The noob team is composed of:
Edmil Sta. Maria
Anthony Lazam
