LONDON BICYCLE RENTALS

View associated files on GitHub

Tools and Languages: SQL, BigQuery

Description and Intent: This project used Google's BigQuery to explore and analyse a large public database containing data about bicycle rentals from the publicly availably cycle stations within London, England. The database contained two tables: 'cycle_hire' with 83,434,866 rows and 'cycle_stations' with 795 rows. 

Explorations were intended to remain within BigQuery only and used the following set of questions to guide the analysis:

Insights and Reporting: Queries were conducted with various techniques including aggregate functions, joins, CTEs and windows functions. The code was formatted to resemble more closely BigQuery's suggested auto-formatting; this was similar to previous project's code formatting, but code was spread across a greater number of rows to differentiate steps more clearly.

Queried results to the set questions were as follows:

1. What is the first rental start date and last rental end date? ie. What span of time does the database cover?

2. How has the number of rentals changed over time? These results showed growth in all years but 2019 and 2020 (excluding 2023 for which there are only some January data available). 2015-16 and 2020-21 showed the greatest overall growth; it is only speculation, but the 2020-21 growth could be attributed to some COVID-19 restrictions lifting in London. The two years of negative growth are not immensely significant, but they are important to note in the context of generally small growth each year.

3. What is the longest, shortest and average rental duration?

4. How many different bikes and bike models have been in use over the full period?

5. How many times has each bike been rented? Can the highest number of rentals be attributed to older bikes? ie. Are many bikes being used more/less than expected? These results reflect the bikes with the greatest number of rentals and it can be seen that they correspond with some of the oldest bikes in the fleet. Similarly, the newest bikes in the fleet are associated with very small numbers of rentals. Further investigation is needed to conclude with confidence that the use of bikes is relatively even across the fleet and reflective of bike ages.

6. What are the most popular starting stations?

7. Are there any increasing or decreasing trends for use of the top starting station? The use of the Hyde Park Corner cycle station (as a starting point) seems to vary considerably from year to year; further investigation into reasons for this would be required prior to making any assumptions about this observation.

8. What are the most popular finishing stations? Hyde Park Corner sits at the top of the starting and finishing stations lists. Given the size of the park and available pathways, as well as the number of visitors the park receives each year, this does not seem like a surprising outcome. It suggests that bikes are rented for pleasure and to view the park before being returned to the starting station.

9. How many journeys start and finish at the same location? With only 4% of journeys starting and stopping at the same station, this suggests that most rentals are for travel from one location to another.

10. What are the geolocations (latitude, longitude) of each station? These results were obtained for possible mapping purposes in future.

Copyright © 2023, Brandon O'Donnell

LinkedInGitHubEmail