Main Project n 33
-
Upload
imran-hossain -
1. Introduction
1.1 Origin of the Report:
BUS 511 is a statistics course offered in the MBA program of NSU to equip students with
statistical tools. This project was initiated to give students practical exposure to
statistical analysis through project work. Several statistical tools were used in this
project to obtain the results.
1.2 Problem Statement:
The automobile industry is important and fast-growing around the globe, so the selling price
of a car is of general interest. In this report we examine different variables of cars that
affect the selling price, using different car brands and models as our sample data. Many
variables affect the selling price of a branded car. We have chosen 4 of these to analyze the
selling price of 33 different car models.
In this paper a model is set up to establish the relationship between these variables and the
cars' selling prices. The variables used in this report are given below:
Engine displacement- Cubic Centimeters (CC)
Horse power (HP)
Fuel Miles per gallon (MPG)
Preferred Package Accessories ($)
Wheel /Drive
1.3 Objectives of the study:
To find out the level of impact and relationship between Cubic Centimeters and the car's selling price.
To find out the level of impact and relationship between Horse power and the car's selling price.
To find out the level of impact and relationship between Fuel miles per gallon and the car's selling price.
To find out the level of impact and relationship between preferred package accessories and the car's selling price.
To find out the level of impact and relationship between Wheel drive and the car's selling price.
Regression analysis of the independent variables with the dependent variable
Testing the usefulness of the model
Testing the partial regression coefficients
Testing the correlation coefficients
To get practical exposure to statistical analysis
1.4 Methodology:
The data used in this report were collected from different car showrooms in the city. These
include the sole agents of the manufacturers in the city, such as Pacific Motors Bd for Nissan
and Hyundai, Navana 3s for Toyota, Honda and some local car dealers. The 33 car models are
used here as the sample. After collecting the data, we analyzed them with statistical
software (Minitab 15 and 17). The collected data were first summarized and presented
graphically. Then we tested some hypotheses about the population mean for each of the
variables. After that, we used Minitab to calculate the correlations among the different
variables to see the strength of their relationships. Then we tested hypotheses about the
correlation coefficients. Then we built a few simple regression equations for the selling
price and the independent variables. Then we extended the relationships to a multiple
regression model. After that we tested some hypotheses about the partial regression
coefficients, and finally we tested the usefulness of the regression model.
2. Background
2.1 History of the Automobile industry
The history of the automobile begins as early as 1769, with the creation of steam engine
automobiles capable of human transport. In 1806, the first cars powered by an internal
combustion engine running on fuel gas appeared, which led to the introduction in 1885 of the
ubiquitous modern gasoline- or petrol-fueled internal combustion engine. Cars powered by
electric power briefly appeared at the turn of the 20th century, but largely disappeared from
use until the turn of the 21st century. The early history of the automobile can be divided into
a number of eras, based on the prevalent means of propulsion. Later periods were defined by
trends in exterior styling, and size and utility preferences.
2.2 Global Automobile Sales:
Fig: Global automobiles sales in 2013
2.3 Car brands used as sample data in the analysis:
3. Variables
3.1 Explanation of test parameters
There are 5 variables in total in this project: 1 dependent variable and 4 independent
variables. The selling price of a car is always of interest to anyone who wants to buy one,
so the car's selling price is our dependent variable in this report. Four variables are taken
to affect the selling price, so these are the independent variables. They are given below:
Engine Displacement- Cubic Centimeters (CC)
Horse power (HP)
Fuel Miles per gallon (MPG)
Wheel /Drive
3.2 Dependent variable
In our case a branded car’s selling price is the dependent variable. The price of the car at the
showroom is the selling price. This is a dependent variable, because it may be affected by
several independent variables.
3.3 Independent variables
Factors that affect the car's selling price are the independent variables. We have 4
independent variables in this report.
Cubic Centimeters (CC)
Engine displacement, measured in cubic centimeters (cc), is the total volume of all cylinders
at full stroke. It is sometimes quoted in cubic inches (ci) instead. In general, the higher
the cc, the larger and more powerful the engine.
Horse power (HP)
Horsepower (hp) is the name of several units of measurement of power. Horsepower was
originally defined to compare the output of steam engines with the power of draft horses in
continuous operation. The unit was widely adopted to measure the output of piston engines,
turbines, electric motors, and other machinery. The definition of the unit varied between
geographical regions. Most countries now use the SI unit watt for measurement of power.
Fuel Miles per gallon (MPG)
Efficiency is defined as output per input. For automobiles it is the distance traveled per
unit of fuel used, expressed in miles per gallon (mpg) or kilometers per liter (km/L). mpg is
commonly used in the UK and US; km/L is used in Japan, Korea, India, Pakistan, parts of
Africa, The Netherlands, Denmark and Latin America. If mpg is used, the gallon (imperial or
US) should be identified, because the two differ in size.
Wheel /Drive
A drive wheel is a road wheel in an automotive vehicle that receives torque from the power
train, and provides the final driving force for a vehicle. A two-wheel drive vehicle has two
driven wheels, and a four-wheel drive has four, and so-on. A steer wheel is one that turns to
change the direction of a vehicle. A trailer wheel is one that is neither a drive wheel nor a
steer wheel.
Two wheel drive
For four-wheeled vehicles, this term is used to describe vehicles that are able to transmit
torque to at most two road wheels, referred to as either front- or rear-wheel drive. The term
4x2 is also used, to indicate four total road-wheels with two being driven.
Four-wheel drive or All-wheel drive
Four-wheel drive, 4WD, 4x4 ("four-by-four"), all-wheel drive, and AWD are terms used to
describe a four-wheeled vehicle with a drive train that allows all four road wheels to receive
torque from the internal combustion engine simultaneously. While some people associate the
term with off-road vehicles, powering all four wheels also provides better control, and
therefore safety, on slick ice, and is an important part of rally racing on mostly paved roads.
Front-wheel drive
Front-wheel drive (or FWD for short) is the most common form of internal combustion
engine / transmission layout used in modern passenger cars, where the engine drives the front
wheels. Most front-wheel drive vehicles today feature transverse engine mounting, whereas
in past decades engines were mostly positioned longitudinally. Rear-wheel drive was
the traditional standard, and is still widely used in luxury cars and most sports cars.
Four-wheel drive is also sometimes used.
Rear-wheel drive
Rear-wheel drive (or RWD for short) was a common internal combustion engine /
transmission layout used in automobiles throughout the 20th century.
4. Statistical Approaches
4.1 Theoretical Model:
Dependent variable: Car’s selling price (Y)
Independent variable: X1, X2, X3, X4
Car selling price, Y= f (X1, X2, X3, X4)
The analysis would be based on different variables of cars and the internal relationship of
their characteristics with the car’s selling price.
4.2 Regression Model:
A multiple regression equation was set up as follows, on the basis of the least squares method:
Ŷ = β0 + β1X1 + β2X2 + β3X3 + β4X4
Where, Ŷ = Car selling price (BDT)
X1= Cubic Centimeters (CC)
X2 = Horse power (HP)
X3 = Fuel Miles per gallon (MPG)
X4 = Wheel /Drive
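As a rough illustration of how the least squares coefficients β0 to β4 can be obtained, the sketch below solves the normal equations (X'X)b = X'y with plain Python. The function name `fit_ols` and the toy numbers are assumptions made for this example only; the report itself used Minitab, not this code.

```python
def fit_ols(rows, y):
    """Least-squares fit of y on the predictor rows, with an intercept.

    rows: list of predictor tuples, e.g. (CC, HP, MPG, WheelDrive)
    y:    list of responses, e.g. selling prices
    Returns [b0, b1, ..., bk].
    """
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(k)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Toy data (hypothetical): y is exactly 1 + 2*x1 + 3*x2, so the fit
# should recover those three coefficients up to rounding error.
toy_rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
toy_y = [1 + 2 * a + 3 * b for a, b in toy_rows]
coefficients = fit_ols(toy_rows, toy_y)
print([round(c, 6) for c in coefficients])
```

The same function could be applied to the four predictor columns of the data sheet, although a statistics package handles badly scaled data such as prices in the millions more robustly than hand-solved normal equations.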
4.3 Hypothesis:
H1: Cubic Centimeters (CC) has impact on car selling price
H2: Horse power (HP) has impact on car selling price
H3: Fuel Miles per gallon (MPG) has impact on car selling price
H4: Wheel /Drive has impact on car selling price
4.4 Sample size
Considering time and other limitations, we found that it would be most appropriate to work
with 33 car models of different brands.
Number of observations, n = 33
Variables: {X1, X2, X3, X4}
4.5 Data Sheet
| No. | Car Model | Selling Price in BDT | CC | HP | Fuel (MPG) | Wheel /Drive |
|---|---|---|---|---|---|---|
| 1 | 2013 NISSAN PATROL | 16500000 | 5700 | 381 | 15 | 4 |
| 2 | 2012 NISSAN MURANO | 9500000 | 4000 | 270 | 19 | 4 |
| 3 | 2012 Toyota Premio G | 2850000 | 1500 | 135 | 28 | 2 |
| 4 | 2012 Toyota Allion | 2800000 | 1500 | 135 | 28 | 2 |
| 5 | 2012 NISSAN SUNNY | 1650000 | 1300 | 132 | 25 | 2 |
| 6 | 2013 Toyota Yaris | 1750000 | 1299 | 132 | 30 | 2 |
| 7 | 2013 Toyota Prius Hybrid | 3450000 | 1800 | 165 | 65 | 2 |
| 8 | 2013 Toyota Camry Hybrid | 8200000 | 2500 | 231 | 66 | 4 |
| 9 | 2012 NISSAN SYLPHY | 2300000 | 2000 | 132 | 46 | 2 |
| 10 | 2012 NISSAN BLUEBIRD | 2650000 | 1800 | 98 | 50 | 4 |
| 11 | Kia Sportage 2013 | 5200000 | 2400 | 115 | 39 | 4 |
| 12 | 2012 NISSAN X-TRAIL | 6400000 | 1800 | 98 | 42 | 2 |
| 13 | 2012 NISSAN CEFIRO | 4550000 | 2500 | 179 | 24 | 2 |
| 14 | Toyota Avanza | 1450000 | 1300 | 132 | 30 | 2 |
| 15 | 2012 NISSAN PATHFINDER Hybrid | 4500000 | 3500 | 266 | 21 | 4 |
| 16 | 2013 Toyota Rav4 | 4200000 | 2362 | 159 | 26 | 4 |
| 17 | Toyota Landcruiser 200 | 40000000 | 4500 | 310 | 15 | 4 |
| 18 | Toyota Prado 2013 | 13200000 | 2982 | 182 | 21 | 4 |
| 19 | 2012 NISSAN SUNNY 1.5 | 1750000 | 1500 | 132 | 22 | 2 |
| 20 | 2012 Toyota Fortuner | 9000000 | 2694 | 270 | 17 | 4 |
| 21 | 2011 NISSAN DUALIS | 5700000 | 3500 | 268 | 20 | 2 |
| 22 | 2011 NISSAN TEANA | 2250000 | 2500 | 169 | 22 | 2 |
| 23 | 2013 Hyundai Sonata | 4500000 | 2400 | 179 | 28 | 2 |
| 24 | Hyundai i10 | 1500000 | 1200 | 105 | 30 | 2 |
| 25 | 2011 NISSAN SKYLINE | 5200000 | 3500 | 270 | 17 | 4 |
| 26 | Hyundai Eon | 1150000 | 814 | 95 | 35 | 2 |
| 27 | 2013 KIA Optima | 6300000 | 2400 | 175 | 27 | 2 |
| 28 | Toyota Vista 2000 | 1700000 | 1800 | 132 | 22 | 2 |
| 29 | Toyota Corolla G 2012 | 1600000 | 1600 | 127 | 28 | 2 |
| 30 | Honda 2014 CRV | 8400000 | 2500 | 179 | 22 | 4 |
| 31 | Honda City | 1950000 | 1300 | 120 | 30 | 2 |
| 32 | Honda Accord 2013 | 2800000 | 2400 | 185 | 24 | 2 |
| 33 | Mitsubishi Pajero Sport 2013 | 6900000 | 2700 | 175 | 25 | 4 |
4.6 Graphs
4.6.1 Histogram: A histogram is a graphical representation of the distribution of data. It is an estimate of the probability distribution of a continuous variable. A histogram represents tabulated frequencies as adjacent rectangles erected over discrete intervals, each with an area proportional to the frequency of the observations in the interval. The total area of the histogram is equal to the number of observations.
Fig: Histogram of Selling Price in BDT, with fitted normal curve (Mean 5813636, StDev 7094719, N 33)
Fig: Histogram of CC, with fitted normal curve (Mean 2350, StDev 1052, N 33)
Fig: Histogram of HP, with fitted normal curve (Mean 176.8, StDev 69.52, N 33)
Fig: Histogram of Fuel (MPG), with fitted normal curve (Mean 29.06, StDev 12.48, N 33)
Fig: Histogram of Wheel /Drive, with fitted normal curve (Mean 2.788, StDev 0.9924, N 33)
4.6.2 Scatter diagram: The scatterplot is widely used to present measurements of two or more related variables. It is particularly useful when the values of the variable on the y-axis are thought to depend on the values of the variable on the x-axis (usually an independent variable). In a scatterplot, the data points are plotted but not joined; the resulting pattern indicates the type and strength of the relationship between two or more variables.
Fig: Scatterplot of Selling Price in BDT vs CC
Fig: Scatterplot of Selling Price in BDT vs HP
Fig: Scatterplot of Selling Price in BDT vs Fuel (MPG)
Fig: Scatterplot of Selling Price in BDT vs Wheel /Drive
4.6.3 Probability Plot: The normal probability plot is a graphical technique for normality testing: assessing whether or not a data set is approximately normally distributed. The data are plotted against a theoretical distribution in such a way that the points should form approximately a straight line. Departures from this straight line indicate departures from the specified distribution.
Fig: Probability Plot of Selling Price in BDT, Normal - 95% CI (Mean 5813636, StDev 7094719, N 33, AD 3.855, P-Value <0.005)
Fig: Probability Plot of CC, Normal - 95% CI (Mean 2350, StDev 1052, N 33, AD 0.973, P-Value 0.013)
Fig: Probability Plot of HP, Normal - 95% CI (Mean 176.8, StDev 69.52, N 33, AD 1.574, P-Value <0.005)
Fig: Probability Plot of Fuel (MPG), Normal - 95% CI (Mean 29.06, StDev 12.48, N 33, AD 2.029, P-Value <0.005)
Fig: Probability Plot of Wheel /Drive, Normal - 95% CI (Mean 2.788, StDev 0.9924, N 33, AD 6.124, P-Value <0.005)
4.6.4 Dot Plot: The dot plot, as a representation of a distribution, consists of a group of data points plotted on a simple scale. Dot plots are used for continuous, quantitative, univariate data. Data points may be labelled if there are few of them. Dot plots are one of the simplest statistical plots, and are suitable for small to moderate sized data sets. They are useful for highlighting clusters and gaps, as well as outliers. Their other advantage is the conservation of numerical information.
Fig: Dotplot of Selling Price in BDT
Fig: Dotplot of CC
Fig: Dotplot of HP
Fig: Dotplot of Wheel /Drive
4.6.5 BOX PLOT: A box plot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points. Box plots display differences between populations without making any assumptions of the underlying statistical distribution: they are non-parametric. The spacings between the different parts of the box help indicate the degree of dispersion (spread) and skewness in the data, and identify outliers.
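To make the quartile idea concrete, the sketch below computes the three quartiles of the HP column from the data sheet and flags points beyond 1.5 times the interquartile range. The 1.5 × IQR rule is a common convention assumed here for illustration; the report does not state which rule its box plots use. The `exclusive` method of `statistics.quantiles` interpolates at positions based on n + 1.

```python
import statistics

# HP column, in data-sheet order (33 car models).
hp = [381, 270, 135, 135, 132, 132, 165, 231, 132, 98, 115,
      98, 179, 132, 266, 159, 310, 182, 132, 270, 268, 169,
      179, 105, 270, 95, 175, 132, 127, 179, 120, 185, 175]

# Quartiles using (n + 1)-based positions.
q1, median, q3 = statistics.quantiles(hp, n=4, method="exclusive")

# Assumed convention: whisker fences at 1.5 * IQR beyond the box.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sorted(v for v in hp if v < lower_fence or v > upper_fence)

print(q1, median, q3, outliers)
```

With this data the quartiles agree with the Q1, Median and Q3 values reported for HP in the descriptive statistics (132, 165 and 208).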
Fig: Boxplot of Selling Price in BDT
Fig: Boxplot of CC
Fig: Boxplot of HP
Fig: Boxplot of Fuel (MPG)
Fig: Boxplot of Wheel /Drive
5. Descriptive statistics
5.1 Descriptive Statistics: Selling Price, CC, HP, Fuel (MPG), wheel drive
Descriptive Statistics: Selling Price in BDT, CC, HP, Fuel (MPG), Wheel /Drive
| Variable | N | N* | Mean | SE Mean | StDev | Minimum | Q1 | Median | Q3 | Maximum |
|---|---|---|---|---|---|---|---|---|---|---|
| Selling Price in BDT | 33 | 0 | 5813636 | 1235032 | 7094719 | 1150000 | 1850000 | 4200000 | 6650000 | 40000000 |
| CC | 33 | 0 | 2350 | 183 | 1052 | 814 | 1500 | 2400 | 2697 | 5700 |
| HP | 33 | 0 | 176.8 | 12.1 | 69.5 | 95.0 | 132.0 | 165.0 | 208.0 | 381.0 |
| Fuel (MPG) | 33 | 0 | 29.06 | 2.17 | 12.48 | 15.00 | 21.50 | 26.00 | 30.00 | 66.00 |
| Wheel /Drive | 33 | 0 | 2.788 | 0.173 | 0.992 | 2.000 | 2.000 | 2.000 | 4.000 | 4.000 |
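As a small check on how the Mean, StDev and SE Mean columns relate, the sketch below recomputes them for the Wheel /Drive variable. In the data sheet, 20 models are coded 2 and 13 models are coded 4, so the list is written in that condensed form here (an assumption of convenience; the order of the values does not affect these statistics).

```python
import math
import statistics

# Wheel /Drive column from the data sheet: 20 cars coded 2, 13 cars coded 4.
wheel_drive = [2] * 20 + [4] * 13

n = len(wheel_drive)
mean = statistics.mean(wheel_drive)
stdev = statistics.stdev(wheel_drive)   # sample standard deviation (n - 1)
se_mean = stdev / math.sqrt(n)          # standard error of the mean

print(n, round(mean, 3), round(stdev, 4), round(se_mean, 3))
```

The rounded results agree with the Wheel /Drive figures reported by Minitab (mean 2.788, standard deviation 0.9924, standard error 0.173).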
5.2 Summary
Summary reports (Minitab graphical summary) for each variable:

| Statistic | Selling Price in BDT | CC | HP | Fuel (MPG) | Wheel /Drive |
|---|---|---|---|---|---|
| Anderson-Darling A-Squared | 3.85 | 0.97 | 1.57 | 2.03 | 6.12 |
| P-Value | <0.005 | 0.013 | <0.005 | <0.005 | <0.005 |
| Mean | 5813636 | 2350.0 | 176.76 | 29.061 | 2.7879 |
| StDev | 7094719 | 1052.3 | 69.52 | 12.485 | 0.9924 |
| Variance | 5.03350E+13 | 1107243.8 | 4833.44 | 155.871 | 0.9848 |
| Skewness | 3.8000 | 1.26155 | 1.17036 | 1.71811 | 0.45507 |
| Kurtosis | 17.3185 | 2.06613 | 0.94583 | 2.90649 | -1.91285 |
| N | 33 | 33 | 33 | 33 | 33 |
| Minimum | 1150000 | 814.0 | 95.00 | 15.000 | 2.0000 |
| 1st Quartile | 1850000 | 1500.0 | 132.00 | 21.500 | 2.0000 |
| Median | 4200000 | 2400.0 | 165.00 | 26.000 | 2.0000 |
| 3rd Quartile | 6650000 | 2697.0 | 208.00 | 30.000 | 4.0000 |
| Maximum | 40000000 | 5700.0 | 381.00 | 66.000 | 4.0000 |
| 95% Confidence Interval for Mean | 3297958 to 8329314 | 1976.9 to 2723.1 | 152.11 to 201.41 | 24.634 to 33.488 | 2.4360 to 3.1398 |
| 95% Confidence Interval for Median | 2474103 to 5451281 | 1800.0 to 2500.0 | 132.00 to 179.00 | 22.000 to 29.005 | 2.0000 to 4.0000 |
| 95% Confidence Interval for StDev | 5705496 to 9384136 | 846.2 to 1391.8 | 55.91 to 91.96 | 10.040 to 16.514 | 0.7981 to 1.3126 |
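The 95% confidence interval for a mean follows the usual form mean ± t × StDev / √n. The sketch below applies it to the Wheel /Drive summary figures. The critical value 2.0369 for 32 degrees of freedom is taken from a standard t-table and is an assumption of this example, since the Python standard library has no t-distribution quantile function.

```python
import math

def mean_confidence_interval(mean, stdev, n, t_critical):
    """Two-sided confidence interval for a mean from summary statistics."""
    half_width = t_critical * stdev / math.sqrt(n)
    return mean - half_width, mean + half_width

# Summary figures reported for Wheel /Drive.
T_CRIT_DF32 = 2.0369  # assumed table value: t(0.975, df = 32)
low, high = mean_confidence_interval(2.7879, 0.9924, 33, T_CRIT_DF32)

print(round(low, 4), round(high, 4))
```

The result is close to the interval Minitab reports for Wheel /Drive (2.4360 to 3.1398); small differences can arise because the inputs here are already rounded.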
6. Regression Analysis: A regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held fixed. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function which can be described by a probability distribution. The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis. In other words, a predictor that has a low p-value is likely to be a meaningful addition to your model because changes in the predictor's value are related to changes in the response variable. Conversely, a larger (insignificant) p-value suggests that changes in the predictor are not associated with changes in the response. Typically, we use the coefficient p-values to determine which terms to keep in the regression model.
Regression Analysis: Selling Price in BDT versus CC, HP, Fuel (MPG), Wheel /Drive
Regression Equation:
Selling Price in BDT = -6269746 + 4113 CC + 480 HP - 4486 Fuel (MPG) + 883610 Wheel /Drive
Explanation:
β0 = -6269746 is the intercept (constant term): the fitted selling price when all four predictors equal zero.
For a one-unit increase in CC, the fitted selling price increases by 4113 BDT with the other variables held fixed; the two variables have a positive relationship.
For a one-unit increase in HP, the fitted selling price increases by 480 BDT with the other variables held fixed; the two variables have a positive relationship.
For a one-unit increase in Fuel (MPG), the fitted selling price decreases by 4486 BDT with the other variables held fixed; the two variables have a negative relationship.
For a one-unit increase in Wheel /Drive, the fitted selling price increases by 883610 BDT with the other variables held fixed; the two variables have a positive relationship.
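To show how the regression equation turns predictor values into a fitted price, the sketch below evaluates it for observation 17 of the data sheet (Toyota Landcruiser 200: CC 4500, HP 310, 15 MPG, Wheel /Drive 4). The function name `predict_price` is hypothetical. Because the printed coefficients are rounded, the result differs slightly from the fit of 15854387 listed in the Minitab output.

```python
def predict_price(cc, hp, mpg, wheel_drive):
    """Fitted selling price in BDT, using the rounded coefficients as printed."""
    return (-6269746
            + 4113 * cc
            + 480 * hp
            - 4486 * mpg
            + 883610 * wheel_drive)

# Observation 17: Toyota Landcruiser 200, actual price 40000000 BDT.
fitted_17 = predict_price(cc=4500, hp=310, mpg=15, wheel_drive=4)
residual_17 = 40000000 - fitted_17

print(fitted_17, residual_17)
```

The large positive residual is consistent with this observation being flagged for a large residual in the Minitab diagnostics.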
Regression Table

| Predictor | Coef | SE Coef | T-value | P-value |
|---|---|---|---|---|
| Constant | -6269746 | 4584915 | -1.37 | 0.182 |
| CC | 4113 | 2641 | 1.56 | 0.131 |
| HP | 480 | 36687 | 0.01 | 0.990 |
| Fuel (MPG) | -4486 | 86150 | -0.05 | 0.959 |
| Wheel/Drive | 883610 | 1276072 | 0.69 | 0.494 |
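Each T-value in the regression table is the coefficient divided by its standard error. The sketch below recomputes the ratios and compares them with a two-sided 5% critical value. The cutoff 2.048 for 28 error degrees of freedom is an assumed t-table value, not something stated in the report.

```python
# (coefficient, standard error) pairs as printed in the regression table.
terms = {
    "Constant": (-6269746, 4584915),
    "CC": (4113, 2641),
    "HP": (480, 36687),
    "Fuel (MPG)": (-4486, 86150),
    "Wheel/Drive": (883610, 1276072),
}

T_CRIT_DF28 = 2.048  # assumed table value: t(0.975, df = 28)

t_values = {name: coef / se for name, (coef, se) in terms.items()}
significant = {name: abs(t) > T_CRIT_DF28 for name, t in t_values.items()}

for name, t in t_values.items():
    print(f"{name:12s} t = {t:6.2f}  significant at 5%: {significant[name]}")
```

None of the ratios exceeds the cutoff, which agrees with every individual P-value in the table being above 0.05.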
S = 5398952, R-Sq = 49.33%, R-Sq(adj) = 42.09%, R-Sq(pred) = 27.36%
The coefficient of determination (R-Sq) and its adjusted value were found to be 49.33% and 42.09% respectively. That means 49.33% of the variation in Selling Price is explained by CC, HP, Fuel (MPG) and Wheel/Drive together.
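The R-Sq, adjusted R-Sq and overall F-value can all be recovered from the regression and error sums of squares in the Minitab analysis of variance. The sketch below does this with the printed values; the variable names are chosen for this example only.

```python
# Sums of squares and degrees of freedom from the analysis of variance.
ss_regression = 7.94558e14
ss_error = 8.16163e14
df_regression = 4      # number of predictors
df_error = 28          # n - predictors - 1 = 33 - 4 - 1

ss_total = ss_regression + ss_error
n = df_regression + df_error + 1

r_sq = ss_regression / ss_total
# The adjusted value penalises R-Sq for the number of predictors used.
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / df_error
# Overall F: mean square for regression over mean square for error.
f_value = (ss_regression / df_regression) / (ss_error / df_error)

print(f"R-Sq = {r_sq:.2%}, R-Sq(adj) = {r_sq_adj:.2%}, F = {f_value:.2f}")
```

An overall F-value near 6.81 alongside individually non-significant coefficients is a pattern often seen when predictors are strongly correlated with each other.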
Minitab Output:

Regression Equation
Selling Price in BDT = -6269746 + 4113 CC + 480 HP - 4486 Fuel (MPG) + 883610 Wheel /Drive

Analysis of Variance

| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|---|---|---|---|---|---|
| Regression | 4 | 7.94558E+14 | 1.98640E+14 | 6.81 | 0.001 |
| CC | 1 | 7.06797E+13 | 7.06797E+13 | 2.42 | 0.131 |
| HP | 1 | 4993542516 | 4993542516 | 0.00 | 0.990 |
| Fuel (MPG) | 1 | 79043102586 | 79043102586 | 0.00 | 0.959 |
| Wheel /Drive | 1 | 1.39762E+13 | 1.39762E+13 | 0.48 | 0.494 |
| Error | 28 | 8.16163E+14 | 2.91487E+13 | | |
| Lack-of-Fit | 27 | 8.16162E+14 | 3.02282E+13 | 24182.58 | 0.005 |
| Pure Error | 1 | 1250000000 | 1250000000 | | |
| Total | 32 | 1.61072E+15 | | | |

Model Summary

| S | R-sq | R-sq(adj) | R-sq(pred) |
|---|---|---|---|
| 5398952 | 49.33% | 42.09% | 27.36% |

Coefficients

| Term | Coef | SE Coef | T-Value | P-Value | VIF |
|---|---|---|---|---|---|
| Constant | -6269746 | 4584915 | -1.37 | 0.182 | |
| CC | 4113 | 2641 | 1.56 | 0.131 | 8.48 |
| HP | 480 | 36687 | 0.01 | 0.990 | 7.14 |
| Fuel (MPG) | -4486 | 86150 | -0.05 | 0.959 | 1.27 |
| Wheel /Drive | 883610 | 1276072 | 0.69 | 0.494 | 1.76 |

Fits and Diagnostics for Unusual Observations

| Obs | Selling Price in BDT | Fit | Resid | Std Resid | Flag |
|---|---|---|---|---|---|
| 8 | 8200000 | 7361820 | 838180 | 0.23 | X |
| 17 | 40000000 | 15854387 | 24145613 | 4.89 | R |

R = Large residual, X = Unusual X
Fig: Fitted Line Plot, Selling Price in BDT = -5214286 + 4693 CC (S 5175848, R-Sq 48.4%, R-Sq(adj) 46.8%)
Fig: Fitted Line Plot, Selling Price in BDT = -5746294 + 65400 HP (S 5533401, R-Sq 41.1%, R-Sq(adj) 39.2%)
Fig: Fitted Line Plot, Selling Price in BDT = 10453817 - 159673 Fuel (MPG) (S 6917845, R-Sq 7.9%, R-Sq(adj) 4.9%)
Fig: Fitted Line Plot, Selling Price in BDT = -4425385 + 3672692 Wheel /Drive (S 6184330, R-Sq 26.4%, R-Sq(adj) 24.0%)
7. Correlations: The correlation coefficient is a measure of linear association between two variables. Values of the correlation coefficient are always between -1 and +1. A correlation coefficient of +1 indicates that two variables are perfectly related in a positive linear sense; a correlation coefficient of -1 indicates that two variables are perfectly related in a negative linear sense, and a correlation coefficient of 0 indicates that there is no linear relationship between the two variables.
| Correlation: Selling Price in BDT with | Pearson correlation | P-Value |
|---|---|---|
| CC | 0.696 | 0.000 |
| HP | 0.641 | 0.000 |
| Fuel (MPG) | -0.281 | 0.113 |
| Wheel /Drive | 0.514 | 0.002 |
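For readers who want to see what lies behind these figures, the sketch below gives a plain-Python Pearson correlation and the usual test statistic for a correlation coefficient, t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. The function names and the small toy series are assumptions for illustration. The statistic is then evaluated for the reported Fuel (MPG) and CC correlations with n = 33.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for testing H0: population correlation = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Toy check (hypothetical data): a perfectly linear pair gives r = 1.
r_toy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# Reported correlations with selling price, n = 33 car models.
t_mpg = correlation_t(-0.281, 33)
t_cc = correlation_t(0.696, 33)

print(round(r_toy, 6), round(t_mpg, 2), round(t_cc, 2))
```

A |t| below roughly 2.04 (an assumed table value for 31 degrees of freedom at the 5% level) corresponds to a P-value above 0.05, which matches Fuel (MPG) being the only correlation reported as non-significant.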
8. One way ANOVAs: In statistics, one-way analysis of variance (one-way ANOVA) is a
technique used to compare means of two or more samples (using the F distribution). This technique can be used only for numerical data.
The ANOVA tests the null hypothesis that samples in two or more groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. These estimates rely on various assumptions. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples. If the group means are drawn from populations with the same mean values, the variance between the group means should be lower than the variance of the samples, following the central limit theorem. A higher ratio therefore implies that the samples were drawn from populations with different mean values.
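The F-statistic described above can be sketched in a few lines: the between-group mean square divided by the within-group mean square. The function name `one_way_anova_f` and the three toy groups are assumptions for this example and are not taken from the car data.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Toy groups (hypothetical): means 2, 3 and 6 with equal spread inside each group.
f_toy = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_toy, 4))
```

The ratio only estimates within-group variance well when groups hold several observations, which is worth keeping in mind when a factor has many levels with a single car each.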
One-way ANOVA: Selling Price in BDT versus CC

Method
Null hypothesis: All means are equal
Alternative hypothesis: At least one mean is different
Significance level: α = 0.05
Equal variances were assumed for the analysis.

Analysis of Variance

| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|---|---|---|---|---|---|
| CC | 17 | 1.56360E+15 | 9.19763E+13 | 29.28 | 0.000 |
| Error | 15 | 4.71250E+13 | 3.14167E+12 | | |
| Total | 32 | 1.61072E+15 | | | |

Model Summary

| S | R-sq | R-sq(adj) | R-sq(pred) |
|---|---|---|---|
| 1772475 | 97.07% | 93.76% | * |

Means

| CC | N | Mean | StDev | 95% CI |
|---|---|---|---|---|
| 814 | 1 | 1150000 | * | (-2627940, 4927940) |
| 1200 | 1 | 1500000 | * | (-2277940, 5277940) |
| 1299 | 1 | 1750000 | * | (-2027940, 5527940) |
| 1300 | 3 | 1683333 | 251661 | (-497862, 3864528) |
| 1500 | 3 | 2466667 | 621155 | (285472, 4647862) |
| 1600 | 1 | 1600000 | * | (-2177940, 5377940) |
| 1800 | 4 | 3550000 | 2030189 | (1661030, 5438970) |
| 2000 | 1 | 2300000 | * | (-1477940, 6077940) |
| 2362 | 1 | 4200000 | * | (422060, 7977940) |
| 2400 | 4 | 4700000 | 1467424 | (2811030, 6588970) |
| 2500 | 4 | 5850000 | 2981890 | (3961030, 7738970) |
| 2694 | 1 | 9000000 | * | (5222060, 12777940) |
| 2700 | 1 | 6900000 | * | (3122060, 10677940) |
| 2982 | 1 | 13200000 | * | (9422060, 16977940) |
| 3500 | 3 | 5133333 | 602771 | (2952138, 7314528) |
| 4000 | 1 | 9500000 | * | (5722060, 13277940) |
| 4500 | 1 | 40000000 | * | (36222060, 43777940) |
| 5700 | 1 | 16500000 | * | (12722060, 20277940) |

Pooled StDev = 1772475
One-way ANOVA: Selling Price in BDT versus HP
Method
Null hypothesis All means are equalAlternative hypothesis At least one mean is differentSignificance level α = 0.05
Equal variances were assumed for the analysis.
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-ValueHP 20 1.58203E+15 7.91017E+13 33.09 0.000Error 12 2.86875E+13 2.39062E+12Total 32 1.61072E+15
Model Summary
S R-sq R-sq(adj) R-sq(pred)1546165 98.22% 95.25% *
Means
HP N Mean StDev 95% CI95 1 1150000 * (-2218803, 4518803)98 2 4525000 2651650 ( 2142896, 6907104)105 1 1500000 * (-1868803, 4868803)115 1 5200000 * ( 1831197, 8568803)120 1 1950000 * (-1418803, 5318803)127 1 1600000 * (-1768803, 4968803)132 6 1766667 284019 ( 391358, 3141975)135 2 2825000 35355 ( 442896, 5207104)159 1 4200000 * ( 831197, 7568803)165 1 3450000 * ( 81197, 6818803)169 1 2250000 * (-1118803, 5618803)175 2 6600000 424264 ( 4217896, 8982104)179 3 5816667 2237372 ( 3871687, 7761646)182 1 13200000 * ( 9831197, 16568803)185 1 2800000 * ( -568803, 6168803)231 1 8200000 * ( 4831197, 11568803)266 1 4500000 * ( 1131197, 7868803)268 1 5700000 * ( 2331197, 9068803)270 3 7900000 2351595 ( 5955021, 9844979)310 1 40000000 * (36631197, 43368803)381 1 16500000 * (13131197, 19868803)
Pooled StDev = 1546165
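The Model Summary figures follow arithmetically from the Analysis of Variance table. The short Python sketch below is not part of the original Minitab session. It re-derives F, S, R-sq and R-sq(adj) for the HP analysis from the reported sums of squares and degrees of freedom. The function name anova_summary is hypothetical and chosen only for illustration.

```python
import math

def anova_summary(ss_factor, df_factor, ss_error, df_error):
    """Re-derive one-way ANOVA summary statistics from sums of squares.

    Illustrative helper (hypothetical name); inputs are taken from the
    'Analysis of Variance' table of the report.
    """
    ms_factor = ss_factor / df_factor      # Adj MS for the factor
    ms_error = ss_error / df_error         # Adj MS for the error term
    ss_total = ss_factor + ss_error
    df_total = df_factor + df_error
    return {
        "F": ms_factor / ms_error,
        "S": math.sqrt(ms_error),          # equals the pooled StDev
        "R-sq": 100 * ss_factor / ss_total,
        "R-sq(adj)": 100 * (1 - ms_error / (ss_total / df_total)),
    }

# Values reported for "Selling Price in BDT versus HP"
hp = anova_summary(1.58203e15, 20, 2.86875e13, 12)
for name, value in hp.items():
    print(f"{name}: {value:.2f}")
```

Because S is the square root of the error mean square, it coincides with the pooled standard deviation that Minitab prints below the Means table.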
One-way ANOVA: Selling Price in BDT versus Fuel (MPG)
Method
Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05
Equal variances were assumed for the analysis.
Factor Information
Factor      Levels  Values
Fuel (MPG)  19      15, 17, 19, 20, 21, 22, 24, 25, 26, 27, 28, 30, 35, 39, 42, 46, 50, 65, 66
Analysis of Variance
Source      DF  Adj SS       Adj MS       F-Value  P-Value
Fuel (MPG)  18  1.23793E+15  6.87738E+13  2.58     0.039
Error       14  3.72794E+14  2.66281E+13
Total       32  1.61072E+15
Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
5160245  76.86%  47.10%     *
Means
Fuel (MPG)  N      Mean     StDev  95% CI
15          2  28250000  16617009  (20424008, 36075992)
17          2   7100000   2687006  ( -725992, 14925992)
19          1   9500000         *  (-1567624, 20567624)
20          1   5700000         *  (-5367624, 16767624)
21          2   8850000   6151829  ( 1024008, 16675992)
22          4   3525000   3259473  (-2008812,  9058812)
24          2   3675000   1237437  (-4150992, 11500992)
25          2   4275000   3712311  (-3550992, 12100992)
26          1   4200000         *  (-6867624, 15267624)
27          1   6300000         *  (-4767624, 17367624)
28          4   2937500   1191200  (-2596312,  8471312)
30          4   1662500    232289  (-3871312,  7196312)
35          1   1150000         *  (-9917624, 12217624)
39          1   5200000         *  (-5867624, 16267624)
42          1   6400000         *  (-4667624, 17467624)
46          1   2300000         *  (-8767624, 13367624)
50          1   2650000         *  (-8417624, 13717624)
65          1   3450000         *  (-7617624, 14517624)
66          1   8200000         *  (-2867624, 19267624)
Pooled StDev = 5160245
One-way ANOVA: Selling Price in BDT versus Wheel /Drive
Method
Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05
Equal variances were assumed for the analysis.
Factor Information
Factor        Levels  Values
Wheel /Drive  2       2, 4
Analysis of Variance
Source        DF  Adj SS       Adj MS       F-Value  P-Value
Wheel /Drive  1   4.25097E+14  4.25097E+14  11.11    0.002
Error         31  1.18562E+15  3.82459E+13
Total         32  1.61072E+15
Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
6184330  26.39%  24.02%     13.83%
Means
Wheel /Drive  N       Mean    StDev  95% CI
2             20   2920000  1676415  (  99642,  5740358)
4             13  10265385  9713508  (6767161, 13763608)
Pooled StDev = 6184330
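The pooled standard deviation reported above can be recovered from the group sizes and group standard deviations in the Means table. The following Python sketch illustrates that relationship for the Wheel /Drive analysis. It assumes the usual one-way ANOVA definition, in which each group variance is weighted by its degrees of freedom. The function name pooled_stdev is hypothetical.

```python
import math

def pooled_stdev(groups):
    """Pooled standard deviation from (n, stdev) pairs.

    Each group contributes (n - 1) * stdev**2 to the error sum of
    squares and (n - 1) to the error degrees of freedom.
    """
    ss_error = sum((n - 1) * sd ** 2 for n, sd in groups)
    df_error = sum(n - 1 for n, _ in groups)
    return math.sqrt(ss_error / df_error)

# (N, StDev) for 2-wheel and 4-wheel drive, from the Means table
wheel_drive = [(20, 1676415), (13, 9713508)]
pooled = pooled_stdev(wheel_drive)
print(round(pooled))
```

The result agrees with the printed value of 6184330 up to rounding of the inputs. It also shows that the 4-wheel-drive group, with its much larger spread, dominates the pooled estimate.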
9. One-Sample Z Test
One-Sample Z (Car Selling Price)
Test of μ = 5800000 vs ≠ 5800000
The assumed standard deviation = 7094719
Variable              N   Mean     StDev    SE Mean  95% CI              Z     P
Selling Price in BDT  33  5813636  7094719  1235032  (3393018, 8234255)  0.01  0.991
One-Sample Z: CC
Test of μ = 2300 vs ≠ 2300
The assumed standard deviation = 1052
Variable  N   Mean  StDev  SE Mean  95% CI        Z     P
CC        33  2350  1052   183      (1991, 2709)  0.27  0.785
One-Sample Z: HP
Test of μ = 176 vs ≠ 176
The assumed standard deviation = 69.52
Variable  N   Mean   StDev  SE Mean  95% CI          Z     P
HP        33  176.8  69.5   12.1     (153.0, 200.5)  0.06  0.950
One-Sample Z: Fuel (MPG)
Test of μ = 29 vs ≠ 29
The assumed standard deviation = 12.48
Variable    N   Mean   StDev  SE Mean  95% CI          Z     P
Fuel (MPG)  33  29.06  12.48  2.17     (24.80, 33.32)  0.03  0.978
One-Sample Z: Wheel /Drive
Test of μ = 2 vs ≠ 2
The assumed standard deviation = 0.9942
Variable      N   Mean   StDev  SE Mean  95% CI          Z     P
Wheel /Drive  33  2.788  0.992  0.173    (2.449, 3.127)  4.55  0.000
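The Z, P and 95% CI columns of the one-sample Z output can be reproduced from the sample mean, the assumed standard deviation and the sample size. The Python sketch below is an illustration only, not the procedure used in the original analysis. It relies on the standard-library statistics.NormalDist class, and the function name one_sample_z is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z(mean, mu0, sigma, n, conf=0.95):
    """Two-sided one-sample Z test with a known (assumed) sigma.

    Returns the z statistic, the two-sided p-value and the
    confidence interval for the population mean.
    """
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)                      # standard error of the mean
    z = (mean - mu0) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    crit = std_normal.inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    return z, p, (mean - crit * se, mean + crit * se)

# Selling price: sample mean, hypothesised mean, assumed sigma, n
z, p, ci = one_sample_z(5813636, 5800000, 7094719, 33)
print(f"z = {z:.2f}, p = {p:.3f}, CI = ({ci[0]:.0f}, {ci[1]:.0f})")
```

Calling the same function with the Wheel /Drive values (2.788, 2, 0.9942, 33) gives a z statistic of about 4.55, the only case in this section where the null hypothesis is rejected.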
10. Hypothesis testing
Hypothesis testing, or significance testing, is a method for testing a claim or hypothesis about a population parameter using data measured in a sample. We assess a hypothesis by determining how likely the observed sample statistic would be if the hypothesis about the population parameter were true.
10.1 Hypothesis test for Mean
1. Car selling price
Sample mean (x̄) = 5813636, Standard Deviation (S) = 7094719, n = 33
Ho: µ = 5800000
HA: µ ≠ 5800000
Test Statistic:
z = (x̄ - µo) / (s / √n) = (5813636 - 5800000) / (7094719 / √33) ≈ 0.01
With α = .05
And p value 0.991, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
The sample gives no evidence that the population mean of car selling price differs from BDT 5800000.
2. CC
Sample mean (x̄) = 2350, Standard Deviation (S) = 1052, n = 33
Ho: µ = 2300
HA: µ ≠ 2300
Test Statistic:
z = (x̄ - µo) / (s / √n) = (2350 - 2300) / (1052 / √33) ≈ 0.27
With α = .05
And p value 0.785, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
The sample gives no evidence that the population mean of CC differs from 2300.
3. HP
Sample mean (x̄) = 176.8, Standard Deviation (S) = 69.52, n = 33
Ho: µ = 176
HA: µ ≠ 176
Test Statistic:
z = (x̄ - µo) / (s / √n) = 0.06 (from the Minitab output)
With α = .05
And p value 0.950, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
The sample gives no evidence that the population mean of HP differs from 176.
4. Fuel (MPG)
Sample mean (x̄) = 29.06, Standard Deviation (S) = 12.48, n = 33
Ho: µ = 29
HA: µ ≠ 29
Test Statistic:
z = (x̄ - µo) / (s / √n) = (29.06 - 29) / (12.48 / √33) ≈ 0.03
With α = .05
And p value 0.978, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
The sample gives no evidence that the population mean of Fuel (MPG) differs from 29.
5. Wheel drive
Sample mean (x̄) = 2.788, Standard Deviation (S) = 0.9942, n = 33
Ho: µ = 2
HA: µ ≠ 2
Test Statistic:
z = (x̄ - µo) / (s / √n) = (2.788 - 2) / (0.9942 / √33) ≈ 4.55
With α = .05
And p value 0.000, which is less than .05
Hence the Null Hypothesis Ho is rejected.
The population mean of Wheel drive differs significantly from 2 at the 5% level.
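The decision rule used in the five tests above is the same each time: reject Ho when the p value is below α, otherwise do not reject it. As a minimal sketch, the rule can be applied to the p values reported in the Minitab output. The dictionary and variable names are illustrative assumptions.

```python
ALPHA = 0.05  # significance level used throughout the report

# Two-sided p-values as reported in the one-sample Z output
reported_p = {
    "Selling Price in BDT": 0.991,
    "CC": 0.785,
    "HP": 0.950,
    "Fuel (MPG)": 0.978,
    "Wheel /Drive": 0.000,
}

# Apply the decision rule to every variable
decisions = {
    name: "reject Ho" if p < ALPHA else "fail to reject Ho"
    for name, p in reported_p.items()
}

for name, decision in decisions.items():
    print(f"{name}: p = {reported_p[name]:.3f} -> {decision}")
```

Failing to reject Ho does not establish that the population mean equals the hypothesised value. It only indicates that the sample is consistent with it.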