Linear Regression Toy Model

Linear regression is a simple tool for modelling the relationship between a scalar dependent variable and one or more explanatory variables, where the relationship is expressed as a linear function of the explanatory variables. If the model assumptions are met, it can predict how much the dependent variable increases or decreases as the explanatory variables change; e.g., for each $1 that I invest in the production budget of my future movie, how much more will the movie earn in total gross? Answering a question like this is not an easy task. Naturally, we would ask: (i) What is the predictive power of our model? (ii) Can we trust the model’s linear coefficient(s)? (iii) Which features do we include or omit in our analysis? Could we do any better, and if so, what does the final model look like?
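To make the budget question concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. The function name `fit_simple_ols` and the budget/gross numbers are made up for illustration; they are not the data or the code used in the post. The slope plays the role of "extra gross per extra $1 of budget", and R² is one rough measure of predictive power.

```python
def fit_simple_ols(x, y):
    """Fit y ≈ intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot

    return slope, intercept, r_squared


# Illustrative, made-up values (say, in millions of dollars).
budget = [10.0, 20.0, 30.0, 40.0, 50.0]
gross = [25.0, 45.0, 70.0, 85.0, 110.0]

slope, intercept, r2 = fit_simple_ols(budget, gross)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
```

A high R² on a handful of points says little about whether the coefficient can be trusted, which is why questions (ii) and (iii) still need a separate answer.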

Read More

MTA Turnstile Data

The MTA has publicly available datasets on turnstile activity. The data is recorded weekly. Here I analyze the data collected over one time interval (end of April through beginning of June). I identify the busiest subway stations and the busiest times of day for each day of an average week (Monday, Tuesday, ..., Sunday). As expected, commuter hub stations such as 34St - Penn Station, 42St - Grand Central, 34St - Herald Square and 42St - Times Square show the most turnstile activity. The goal of this brief investigation is to find other heavily used commuter stations that might not be easily identified based on transit patterns.
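The kind of aggregation described above can be sketched in a few lines of plain Python. This is only an illustration: the record layout (station, timestamp, entries counted in that interval), the helper names, and all the numbers are assumptions made up for the example, not the format of the actual MTA files or the code behind the post.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical, made-up records:
# (station, ISO timestamp, entries counted in that interval).
records = [
    ("34St - Penn Station", "2016-05-02T08:00:00", 1200),
    ("34St - Penn Station", "2016-05-02T12:00:00", 800),
    ("42St - Grand Central", "2016-05-02T08:00:00", 1100),
    ("42St - Grand Central", "2016-05-03T08:00:00", 700),
    ("42St - Times Square", "2016-05-03T08:00:00", 900),
]


def busiest_stations(records):
    """Total entries per station, sorted from busiest to quietest."""
    totals = defaultdict(int)
    for station, _, entries in records:
        totals[station] += entries
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def entries_by_weekday(records):
    """Total entries per day of the week, across all stations."""
    totals = defaultdict(int)
    for _, timestamp, entries in records:
        day = WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]
        totals[day] += entries
    return dict(totals)


for station, total in busiest_stations(records):
    print(station, total)
print(entries_by_weekday(records))
```

Grouping by station and by weekday separately keeps the two questions (which stations, which times) independent; a combined key such as (station, weekday, hour) would be the natural next step.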

Read More