MS9004: ISM
MS9004 Assignment

BACKGROUND BRIEF

The U.S. Department of Transportation's (DOT) Bureau of Transportation Statistics tracks the on-time performance of domestic flights operated by large air carriers. The DOT publishes a monthly Air Travel Consumer Report, and this dataset consists of a small subset of flight delays in 2015.

There are 18 variables in this dataset, each described below. The objective in this scenario is to build statistical models to predict the response variable, ARRIVAL_DELAY.

Variable               Description
MONTH                  Month of the flight trip.
DAY                    Day of the flight trip.
DAY_OF_WEEK            Day of week of the flight trip. (1: Monday, ..., 7: Sunday)
AIRLINE                Airline code.
DEPARTURE_DELAY        Total delay on departure, in minutes. (Negative value means early departure.)
TAXI_OUT               The time duration, in minutes, between departure from the origin airport gate and wheels-off.
SCHEDULED_TIME         Planned time amount, in minutes, needed for the flight trip.
AIR_TIME               The time elapsed, in minutes, between wheels-off and wheels-on.
DISTANCE               Distance travelled between two airports.
TAXI_IN                The time duration, in minutes, between wheels-on and gate arrival at destination airport.
AIR_SYSTEM_DELAY       Delay caused by air system, in minutes.
SECURITY_DELAY         Delay caused by security, in minutes.
AIRLINE_DELAY          Delay caused by the airline, in minutes.
LATE_AIRCRAFT_DELAY    Delay caused by the aircraft, in minutes.
QUARTER                Quarter of the year. (Q1: Jan-Mar, ..., Q4: Oct-Dec)
WEEKEND                Yes if the day is a weekend (i.e. Saturday or Sunday).
LOW_COST               Yes if the airline is a low-cost carrier.
ARRIVAL_DELAY          Total delay on arrival, in minutes. (Negative value means early arrival.)

Airline code:
AA    American Airlines    NK    Spirit Airlines
AS    Alaska Airlines      OO    Skywest Airlines
B6    JetBlue Airways      UA    United Airlines
DL    Delta Airlines       WN    Southwest Airlines

INSTRUCTIONS

1. Explore Data
Perform exploratory data analysis on the variables using the whole dataset. Investigate the factors that could cause arrival delay.
Marks attainable = 10

2. Build & Evaluate Model
Split the dataset into a training set and a test set in the ratio 70:30. Set the random seed using the last 4 digits of your admission number.
Use the training set to build 3 types of multiple linear regression models:
    Model 1 consists of significant quantitative predictors only.
    Model 2 includes significant qualitative predictors.
    Model 3 includes all significant predictors, with any violations of model assumptions suitably addressed.
Then, use each model to predict arrival delay on the test set. Report the appropriate model evaluations.
Marks attainable = 30

3. Interpret & Report
Interpret the 3 types of models.
Present your results in a report of no more than 10 pages, including any relevant graphs, figures or tables which support your analysis. This report and the Jupyter notebook OR Excel file OR Minitab project which shows your relevant code/output are to be submitted online by 19 Feb 2019.
Marks attainable = 10
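
The mechanics of step 2 (a seeded 70:30 split, a multiple linear regression fitted on the training set, and evaluation on the test set) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a prescribed solution: SEED is a placeholder, the helper names (split_70_30, fit_ols, predict, evaluate) are hypothetical, the data are synthetic stand-ins rather than the flight dataset, and the fit is a plain least-squares model that does not perform the significance-based predictor selection the brief asks for.

```python
import numpy as np

# Placeholder only: the brief asks for the last 4 digits of your admission number.
SEED = 1234


def split_70_30(n_rows, seed):
    """Return (train_idx, test_idx) for a reproducible 70:30 split of row indices."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_train = int(round(0.7 * n_rows))
    return shuffled[:n_train], shuffled[n_train:]


def fit_ols(X, y):
    """Least-squares fit of y on X plus an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to new predictor rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def evaluate(y_true, y_pred):
    """Common regression metrics computed on held-out data."""
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "rmse": float(np.sqrt(np.mean(resid ** 2))),
        "mae": float(np.mean(np.abs(resid))),
        "r2": float(1.0 - ss_res / ss_tot),
    }


# Synthetic stand-in data (NOT the real flight data): two made-up predictors
# playing the roles of DEPARTURE_DELAY and TAXI_OUT, and a made-up response.
data_rng = np.random.default_rng(0)
n = 1000
departure_delay = data_rng.normal(10, 30, n)
taxi_out = data_rng.normal(16, 5, n)
arrival_delay = -12 + 1.0 * departure_delay + 0.8 * taxi_out + data_rng.normal(0, 5, n)

X = np.column_stack([departure_delay, taxi_out])
train_idx, test_idx = split_70_30(n, SEED)

# Fit on the training rows only, then score on the untouched test rows.
coef = fit_ols(X[train_idx], arrival_delay[train_idx])
metrics = evaluate(arrival_delay[test_idx], predict(coef, X[test_idx]))

print(len(train_idx), len(test_idx))
print(metrics)
```

In a notebook, the same split is commonly done with scikit-learn's train_test_split using test_size=0.3 and a fixed random_state. The hand-rolled version here only keeps the sketch dependent on NumPy alone.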