Analysis of 5.9 Million Chicago Taxi Trips in 2022 Using Stratified Sampling

As the largest city in Illinois and the third-largest city in the United States, with a population of 2,608,425 people, Chicago is a bustling metropolis with significant transportation demands. Taxi services play a vital role in the city's transportation network. In 2022 alone, nearly 5.9 million Chicago taxi trips were recorded in Google BigQuery's public data, demonstrating that people in the city continue to rely on taxi services as a means of transportation.

With this background in mind, the Chicago Taxi Company aims to analyze taxi trips in Chicago to better understand the potential trends and demands for 2023. This information will help optimize resource allocation and enhance the company's overall performance. To ensure the results are reliable and representative, we propose using stratified sampling techniques for this project.

We will group the data by pickup community, with each group representing a population. Within each population, we will perform stratified sampling: the days are divided into two strata, weekdays and weekends, and a random sample is drawn from each stratum, with the stratum sample sizes determined by optimum allocation. The unit of observation will be the daily sum of trips.
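For reference, optimum allocation is commonly taken to mean Neyman allocation when the cost of sampling one unit is the same in every stratum. That equal-cost reading is an assumption on our part, not something stated above. With $N_h$ days in stratum $h$, standard deviation $S_h$ of daily trips within that stratum, and total sample size $n$, the stratum sample size would be:

```latex
n_h = n \cdot \frac{N_h S_h}{\sum_{k} N_k S_k},
\qquad h \in \{\text{weekday}, \text{weekend}\}
```

Under this rule, a stratum that is larger or more variable receives a larger share of the sample.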

This project serves as the final project for the Sampling Methods course at Sekolah Data Pacmann.

Let’s dive into the analysis below.

The original dataset consists of records for each individual trip, with trips occurring every minute, resulting in a large number of data points. As stated earlier, nearly 5.9 million taxi trips were recorded.

The dataset identifies pickup locations only as points, while our analysis requires grouping trips by pickup community area to create new populations. To achieve this, we use GeoPandas to perform a spatial join that checks whether each pickup point lies within a community's boundaries.
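To illustrate what the spatial join checks for each trip, the sketch below tests whether a pickup point falls inside a community boundary using ray casting in plain Python. The square polygons, the community names, and the `locate` helper are hypothetical. In GeoPandas the same lookup is typically done in one call with `geopandas.sjoin(points, communities, predicate="within")`.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count the polygon edges crossed by a horizontal ray from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locate(x, y, communities):
    """Return the name of the first community containing the point, or None."""
    for name, polygon in communities.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


# Hypothetical boundaries (unit squares), not real Chicago geometry
communities = {
    "Community A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "Community B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

pickups = [(0.5, 0.5), (1.5, 0.25), (3.0, 3.0)]
labels = [locate(x, y, communities) for x, y in pickups]
```

A pickup that falls outside every boundary gets `None`, which makes it easy to drop or inspect such trips before grouping.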

After extracting the community for each pickup point, we group the trips by community name, resample the data to daily frequency, and count the number of trips that occurred each day in each community.
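The grouping and daily aggregation can be sketched as follows. This is a plain-Python illustration with made-up trip records. On a pandas dataframe the same result would usually come from grouping by community and calendar day and counting rows.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip records: (pickup timestamp, community name)
trips = [
    (datetime(2022, 1, 1, 8, 15), "Loop"),
    (datetime(2022, 1, 1, 9, 40), "Loop"),
    (datetime(2022, 1, 1, 9, 41), "Ohare"),
    (datetime(2022, 1, 2, 23, 5), "Loop"),
]

# Count trips per (community, calendar day)
daily_counts = Counter((community, ts.date()) for ts, community in trips)

# Reshape into one {date: trip count} mapping per community
daily_by_community = {}
for (community, day), count in sorted(daily_counts.items()):
    daily_by_community.setdefault(community, {})[day] = count
```

Days without any trips do not appear in this sketch, so they would need to be filled in as zeros before sampling.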

Having prepared the dataframes representing the populations of taxi trips in each community, we can now move on to the stratified sampling process. Let’s take a look at the design of our sampling method.

As shown in the design, after separating the data into 77 populations, we will further stratify the records by day name into weekdays and weekends. To accomplish this, we can use the following function:
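A minimal sketch of such a stratification step, assuming the daily totals of one community are keyed by date. The function name `split_by_day_type` and the sample values are hypothetical, not the project's original code.

```python
from datetime import date


def split_by_day_type(daily_trips):
    """Split a {date: trips} mapping into weekday and weekend strata."""
    strata = {"weekday": {}, "weekend": {}}
    for day, trips in daily_trips.items():
        # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6
        key = "weekend" if day.weekday() >= 5 else "weekday"
        strata[key][day] = trips
    return strata


# Hypothetical daily totals for one community
daily_trips = {
    date(2022, 1, 7): 120,   # Friday
    date(2022, 1, 8): 95,    # Saturday
    date(2022, 1, 9): 80,    # Sunday
    date(2022, 1, 10): 130,  # Monday
}
strata = split_by_day_type(daily_trips)
```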

Utilizing the information on optimum allocation for each community, we can calculate the mean of daily taxi trips for each community using an unbiased estimator. Here’s the function to achieve this:

unbiased estimator for the population means
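One way the allocation and the estimator could look, as a minimal sketch rather than the project's actual code. It assumes optimum allocation means Neyman allocation with equal sampling cost per day, and it uses randomly generated daily totals in place of real data. The function names and the total sample size of 60 are hypothetical.

```python
import random
import statistics


def optimum_allocation(strata, n):
    """Neyman allocation: n_h proportional to N_h * S_h (equal cost assumed)."""
    weights = {h: len(v) * statistics.pstdev(v) for h, v in strata.items()}
    total = sum(weights.values())
    if total == 0:  # no variation anywhere: fall back to proportional allocation
        weights = {h: len(v) for h, v in strata.items()}
        total = sum(weights.values())
    # Keep at least 2 units per stratum so a sample variance can be computed,
    # and never more than the stratum holds. Rounding may shift the total by one.
    return {
        h: min(len(strata[h]), max(2, round(n * w / total)))
        for h, w in weights.items()
    }


def stratified_mean(strata, samples):
    """Unbiased estimator of the population mean: sum of (N_h / N) * sample mean."""
    N = sum(len(v) for v in strata.values())
    return sum(len(strata[h]) / N * statistics.mean(samples[h]) for h in strata)


rng = random.Random(42)
# Hypothetical daily trip totals: 260 weekdays and 105 weekend days, as in 2022
strata = {
    "weekday": [rng.randint(80, 160) for _ in range(260)],
    "weekend": [rng.randint(40, 90) for _ in range(105)],
}
sizes = optimum_allocation(strata, n=60)
# Simple random sample without replacement inside each stratum
samples = {h: rng.sample(strata[h], sizes[h]) for h in strata}
estimate = stratified_mean(strata, samples)
```

Weighting each stratum's sample mean by its share of the population, $N_h / N$, is what keeps the estimator unbiased even though the strata are sampled at different rates.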

Below is the resulting dataframe after applying the function:

Additionally, we can compute the estimated variance of each estimate, allowing us to calculate a 95% confidence interval for the sample mean of daily trips in each community.
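The variance and interval calculation can be sketched as below. This assumes simple random sampling without replacement inside each stratum, so a finite population correction is applied, and a normal approximation with z = 1.96 for the 95% level. The sample values and function names are hypothetical.

```python
import math
import statistics


def stratified_variance(pop_sizes, samples):
    """Estimated variance of the stratified mean, with finite population correction."""
    N = sum(pop_sizes.values())
    var = 0.0
    for h, sample in samples.items():
        N_h, n_h = pop_sizes[h], len(sample)
        s2 = statistics.variance(sample)  # sample variance, n_h - 1 in the denominator
        var += (N_h / N) ** 2 * (1 - n_h / N_h) * s2 / n_h
    return var


def confidence_interval(mean, variance, z=1.96):
    """Return (lower bound, upper bound, margin of error) under a normal approximation."""
    margin = z * math.sqrt(variance)
    return mean - margin, mean + margin, margin


# Hypothetical stratum sizes and sampled daily totals for one community
pop_sizes = {"weekday": 260, "weekend": 105}
samples = {
    "weekday": [120, 135, 110, 150, 128, 142],
    "weekend": [60, 75, 58, 81],
}
N = sum(pop_sizes.values())
mean = sum(pop_sizes[h] / N * statistics.mean(samples[h]) for h in samples)
variance = stratified_variance(pop_sizes, samples)
lower, upper, margin = confidence_interval(mean, variance)
```

With stratum samples as small as these, a t-based multiplier would be a more cautious choice than 1.96.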

Here’s our final dataframe, which includes the sample mean of daily trips, estimated variance, margin of error, confidence interval (CI) lower bound, and CI upper bound for each community. We sort the dataframe by the sample mean and display the top 10 communities:

The results show that the stratified sampling method was effective in estimating the mean daily taxi trips for each community. The sample means, calculated using unbiased estimators, are very close to the population means, indicating the accuracy of the method used.

Based on the top 10 communities with the highest mean daily trips, it’s clear that Near North Side, Loop, and Ohare are the busiest communities, with significantly higher daily taxi trips compared to the others. These areas should be prioritized for additional resources and investment, as focusing on these communities could lead to more customers and increased growth for the company.

However, it’s also essential not to neglect other communities in the top 10, such as Near West Side, Near South Side, Lake View, Garfield Ridge, Lincoln Park, Uptown, and West Town. While their daily taxi trips are not as high as the top three, they still represent significant demand and potential for growth. Allocating resources proportionally to these communities can help optimize the company’s operations and ensure better coverage.

In addition to allocating resources, the company can also leverage the findings of this analysis to tailor its marketing strategies and promotional offers. For instance, offering targeted discounts or incentives in these high-demand communities could attract more customers and foster loyalty. By focusing resources and strategies on these communities, the company can effectively tap into the high demand, resulting in increased growth and customer satisfaction.

In conclusion, the stratified sampling method has proven to be a valuable tool for estimating the mean daily taxi trips for each community in Chicago. Our findings provide crucial insights into the busiest communities, allowing for the efficient allocation of resources and the development of targeted marketing strategies to foster growth and customer satisfaction. The top communities, including Near North Side, Loop, and Ohare, as well as other areas within the top 10, represent significant opportunities for expanding the company’s services and increasing its customer base.

It is important to remember that while this analysis has provided valuable insights, the taxi industry remains dynamic, and it is essential to continually monitor and adapt to changes in demand patterns. Regularly updating and refining the analysis can help the company stay ahead of the competition and respond effectively to emerging trends.

In summary, this project demonstrates the power of using stratified sampling methods and data analysis to inform strategic decision-making for a taxi company operating in a large, bustling city like Chicago. By leveraging these insights, the company can optimize its operations, allocate resources effectively, and ultimately achieve higher growth and customer satisfaction.

Thank you!
