A Multiple Regression Analysis on Financial Beta Coefficients
in the Oil and Natural Gas Industry
William Lamberti
Lamberti 1
Introduction
The goal of this report was to find a model that best explains beta values for the oil
and natural gas industry. We specifically looked at all active stocks on the New York Stock
Exchange with SIC codes 4923, 4924, 4925, 4931, 4932, and the 2900s. We also wanted
to identify which variables are the most important in predicting the beta value. In particular, we
wished to know whether older companies had safer beta values than newer firms. The data
were collected from www.mergentonline.com. The beta values were collected on
the same day from Yahoo Finance, as Mergent does not have this information readily available.
The general approach was an ordinary least squares multiple regression fit of the data.
Throughout this report, β denotes the financial beta statistic of interest, not a multiple
regression coefficient.
It is important to understand what factors might help explain a given beta value for a
stock because betas are heavily relied upon in the financial industry. Firms and individuals
depend on this statistic in one form or another. In particular, individual investors
rely on it when building portfolios suited to their own needs and goals. Therefore, it is
valuable to understand which variables help to explain a given beta value.
Analysis of the beta statistic has been active since its introduction by Sharpe and Lintner.1 2
They defined the beta statistic to be

$$\beta = \frac{\operatorname{Cov}(s_a, s_b)}{\operatorname{Var}(s_b)}$$

where $s_a$ and $s_b$ denote the two respective stocks or markets of interest. The beta statistic
measures the risk of an investment. A beta of exactly 1 indicates
perfect alignment with the market’s volatility. A beta less than 1 indicates a stock that is
less volatile than the market, while a beta greater than 1 indicates a stock that is more volatile.
These correspond to safer and riskier stocks respectively. For instance, an investor could
compare two stocks to obtain a relative risk estimate. In modern practice, however, and for
our purposes, the beta is calculated by comparing a stock to the
market. Betas are also used to help diversify portfolios so that investors can achieve the
optimal investment options given a family of indifference curves.3

1 Lintner, John. "The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and
Capital Budgets." The Review of Economics and Statistics 47, no. 1 (1965): 13. Accessed May 23, 2015.
2 Sharpe, William F. "Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk." The Journal
of Finance XIX, no. 3 (1964): 425. Accessed May 23, 2015.
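To make the definition of beta concrete, the following is a minimal Python sketch of the covariance-over-variance calculation. It is an illustration only: the function names and the return series are hypothetical, and it is not the procedure Yahoo Finance uses to produce the betas analyzed in this report.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def beta(stock_returns, market_returns):
    """beta = Cov(s_a, s_b) / Var(s_b), with s_b taken as the benchmark."""
    variance = sample_covariance(market_returns, market_returns)
    return sample_covariance(stock_returns, market_returns) / variance


# Hypothetical monthly returns, for illustration only.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
stock = [2 * r for r in market]  # moves twice as far as the market each month

print(beta(stock, market))   # close to 2: more volatile than the market
print(beta(market, market))  # close to 1: perfectly aligned with the market
```

A stock whose returns are always twice the market's has a beta of about 2, which matches the reading given above that betas greater than 1 indicate more volatile stocks.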
One of the first analyses of betas was done by Ronald W. Melicher. He performed a
factor analysis and multiple regression on firms operating in the electric utility industry.4 He
chose this sector because of the “implied homogeneity in terms of regulation and business risk”.5
He was motivated in particular by how few studies had been done on beta
values up to that point. Additionally, some of those studies suffered severely from multicollinearity in their
regression models. Melicher used factor analysis to find variables on which to build his
regression model. At the time, this was a helpful way to build a model, as computers
were not nearly as efficient and powerful as they are today. However, factor analysis
is not a preferable technique, as it can give biased results: it is built upon
the assumption that latent variables will be found even if those variables do not exist in practical
applications. Nonetheless, Melicher performed factor analysis and found that financial leverage,
size, earnings trend and stability, operating efficiency, financing policy, return on investment,
and market activity were able to explain about 85% of the variation in the data. He then built a
few models using multiple and stepwise regression techniques. The $R^2$ values for the models
ranged from about 33% to 41%.

3 Ibid.
4 Melicher, Ronald W. "Financial Factors Which Influence Beta Variations within an Homogeneous Industry
Environment." The Journal of Financial and Quantitative Analysis, 1974, 231. Accessed May 25, 2015.
5 Ibid.
However, more recent studies focus on whether the beta statistic should be altered or
replaced. Furthermore, it was difficult to find easily accessible analyses exploring which
variables help to explain beta values. Still, some time series analyses have been done on the
nature of the beta value. For instance, Chang and Weiss found that beta acts more like a random
variable than a fixed parameter value.6
For our analysis, the dependent variable was the beta value for the stock. The
independent variables initially were the number of full-time employees in 2013, the year the
company was incorporated, the long-term debt, the 2013 revenue, the market cap, the earnings
per share, the dividend per share, and the PE ratio. The general approach was an ordinary
least squares multiple regression fit of the data. Our analysis differs from others in that it
looks specifically at the variables that help to explain the value of the beta statistic.
Furthermore, it covers a sector that is crucial to the daily operation of the economy: without
energy to keep businesses and individuals running, the economy would suffer severely.
Therefore, it was desirable to understand what factors explain the beta statistics in this sector.
Methods:
The general approach entailed an ordinary least squares multiple regression fit to the data.
We used SAS to perform the heavier computational analyses and Excel and
R for more elementary calculations. We first checked the
correlations among the independent variables and their variance inflation factors to ensure that no multicollinearity existed.
This led to the removal of the market cap variable. None of the other variables needed to be
removed, as the remaining variance inflation factors were all below 10. With the remaining regressors,
multiple regression was performed using SAS Studio 3.2. The analysis followed a careful
procedure. The data were collected from Mergent’s database and Yahoo Finance. Yahoo Finance
calculates beta by comparing the monthly price change of a stock to the monthly
price change of the S&P 500 over the past 3 years.7 Of the 159 records in Mergent’s database, 64
were active. The final number of observations used in the analysis was 53, as 11 had missing
values and could not be used. Next, a first-order multiple regression model
was proposed. This model had 7 regressors, but we expected to
refine it upon further analysis. Parameters were estimated throughout the analysis by
least squares under the conditions of the Gauss-Markov theorem. The errors
were assumed to be normally distributed to allow for the estimation of the variance and
standard deviations. Following that, the F value and t values were considered for further
analysis. Next, the normality of the error distribution was checked by residual visualization.
This was done for every proposed model. Lastly, for the best model, the overall regression
model and confidence intervals on parameter estimates were derived under the assumption of
normally distributed errors.

6 Chang, Wei-Chien, and Donald E. Weiss. "An Examination of the Time Series Properties of Beta in the Market
Model." Journal of the American Statistical Association, 1991, 883. Accessed May 25, 2015.
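The variance inflation screen described at the start of this section can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS procedure used for the report: it reads each VIF off the diagonal of the inverse of the predictor correlation matrix, which equals 1/(1 - R_j^2), and it drops the predictor with the largest VIF until all are below the threshold of 10. The function names and the three small predictor columns are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), the j-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def screen(named_columns, threshold=10.0):
    """Drop the predictor with the largest VIF until every VIF is below threshold."""
    kept = dict(named_columns)
    while len(kept) > 1:
        values = vif(list(kept.values()))
        worst = max(range(len(values)), key=values.__getitem__)
        if values[worst] < threshold:
            break
        del kept[list(kept)[worst]]
    return list(kept)


# Hypothetical predictors: x2 is almost a copy of x1, x3 is unrelated.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "x3": [2, -1, 0, 3, -2, 1],
}
remaining = screen(predictors)
print(remaining)  # one of the near-duplicate pair is dropped, x3 is kept
```

Removing one variable at a time and recomputing mirrors the order of operations described above, since dropping one collinear regressor lowers the VIFs of the others.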
The first-order model was refined by considering interactions. When all interactions were
included at once, the model did not explain more of the variation in the data. Backward,
forward, and stepwise selection likewise produced models that explained less of the variation
in the data or models that were not valid. We then checked each interaction individually and
included those that helped to explain more variation in the data. Additionally, we believed that
two higher-order terms were needed. The final model therefore had 7 first-order
terms, 3 interactions, and 2 higher-order terms. Analysis of residuals and confidence intervals
provided additional support for the final model. We concluded with an analysis of the
missing observations.

7 "Key Statistics Definitions." Yahoo Finance. Accessed May 26, 2015.
https://help.yahoo.com/kb/finance/SLN2347.html?impressions=true.
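For readers who do not use SAS, the 3 interaction terms and 2 higher-order terms can be sketched in Python as follows. The term names (ynum, yr, lteps, r2, d2) follow the data step in the Appendix; the function name and the sample firm are hypothetical and serve only to show the arithmetic.

```python
def add_model_terms(row):
    """Return a copy of `row` with the interaction and squared terms added."""
    out = dict(row)
    out["ynum"] = row["year_started"] * row["num_full_temp"]  # year x employees
    out["yr"] = row["year_started"] * row["revenue"]          # year x revenue
    out["lteps"] = row["long_term_debt"] * row["eps"]         # debt x EPS
    out["r2"] = row["revenue"] ** 2                           # revenue squared
    out["d2"] = row["dps"] ** 2                               # DPS squared
    return out


# Hypothetical firm; the values are not taken from the data set.
firm = {"num_full_temp": 1000, "year_started": 1950, "long_term_debt": 2.0e9,
        "revenue": 5.0e9, "eps": 2.5, "dps": 1.2, "pe": 15.0}
terms = add_model_terms(firm)
print(terms["ynum"], terms["yr"], terms["lteps"], terms["r2"], terms["d2"])
```

Because revenue and long-term debt are recorded in dollars, the products and squares are very large numbers, which is why the corresponding coefficients in the fitted models are so small.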
Results:
The variance inflation factor (VIF) was used to check for multicollinearity. The VIF
flags independent variables that are highly associated with one another.
Values above 10 were taken to indicate a severe
multicollinearity problem. The first VIF analysis is shown in Table 1. Since 2 variables had
VIF statistics above 10, we removed the one with the highest value, Market Cap. After removing this
variable, the VIF analysis was run again. The results are provided in Table 2. Since all of the
VIF statistics were now below 10, the rest of the variables were retained.

Table 1: Initial VIF Analysis
Variable          VIF
Intercept         0
num_full_temp     9.08873
year_started      1.17284
long_term_debt    2.22526
Revenue           14.71564
Market_Cap        17.36815
EPS               3.68571
DPS               1.45508
PE                1.91887

Table 2: Final VIF Analysis
Variable          VIF
Intercept         0
num_full_temp     7.02489
year_started      1.15978
long_term_debt    2.07018
Revenue           5.92337
EPS               3.13654
DPS               1.36319
PE                1.90276

With the remaining regressor variables, the first-order model was

$$
\begin{aligned}
\text{Model I:}\quad \hat{\beta} ={}& 1.748 - 0.000036\,(\text{num\_full\_temp}) - 0.000989\,(\text{year\_started}) \\
& - 9.68 \times 10^{-12}\,(\text{long\_term\_debt}) + 5.32 \times 10^{-9}\,(\text{Revenue}) + 0.159\,(\text{EPS}) \\
& + 0.165\,(\text{DPS}) + 0.0201\,(\text{PE})
\end{aligned}
$$
The F value for this model was 7.95, with a corresponding p value of less than .0001.
Therefore, the model was significant. The $R^2$ and $R_a^2$ were 0.553 and 0.4835
respectively. All variables had t values that were significant at the .05 level except the
year and long-term debt variables. However, these were retained as they helped to explain
variation in the data. We found evidence that a multiple regression analysis is appropriate
for these data.
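As a consistency check, the adjusted $R^2$ and the overall F value reported for Model I can be recovered from the $R^2$, the 53 observations, and the 7 regressors using the standard textbook formulas. The sketch below is ours and is not part of the SAS output; the function names are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p regressors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def overall_f(r2, n, p):
    """Overall F statistic implied by R^2, on (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Values reported for Model I: R^2 = 0.553, n = 53 observations, p = 7 regressors.
print(round(adjusted_r2(0.553, 53, 7), 4))  # 0.4835
print(round(overall_f(0.553, 53, 7), 2))    # 7.95
```

Both values agree with the figures quoted above, which suggests the reported statistics are internally consistent.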
The plot of the residuals versus the predicted values suggested that a transformation was
necessary (Figure 1). We tried both a square root and a logarithm transformation; the square
root transformation handled the residuals appropriately, and they appeared randomly distributed
around 0 afterwards (Figure 2). We also assessed the QQ plot after the transformation to check
that the residuals were approximately normal. Overall, the QQ plot was fairly linear.
Therefore, the transformation appeared appropriate.

Figure 1: The residuals versus the predicted value plot for the first order model.
Figure 2: The residual versus predicted plot of the model after the square root transformation.
Figure 3: The QQ Plot of the transformed model.

Next, possible interactions were explored. Including all possible interactions did not
produce meaningful results. Backward, forward, and stepwise variable selection did not
produce meaningful results either, because the resulting models were biased, had lower
$R_a^2$, or were not valid. Therefore, we analyzed the residuals to look for interactions or
higher-order terms that might be present. The residual plots of the 7 variables are provided
in Figure 3. Since some pinching was observed in the year variable, we believed that year
could be interacting with other variables. We therefore tested the interaction of year with
each of the other variables individually to see if the results were meaningful. Additionally,
the DPS variable showed curvature in the residuals, suggesting a higher-order term. The
revenue variable was similar, although this is only evident when the plot is expanded.
We added the interaction of the year variable with each of the other variables individually. Each
time we assessed the $R_a^2$ to see whether the term helped to explain the variation in the data. In the end, the
year variable had meaningful interactions with the number of full-time employees and with revenue.
The other interactions with year neither helped explain the variation in the data nor were
significant. All other interactions were considered individually as well. However, only the
interaction between long-term debt and EPS was meaningful. Additionally, the higher-order
terms for revenue and DPS were explored and found to improve the $R_a^2$, so they
were retained as well. This resulted in a model with an additional 3 interaction terms
and 2 higher-order terms. The resulting model was

$$
\begin{aligned}
\text{Model II:}\quad \sqrt{\hat{\beta}} ={}& -0.763 + 0.000475\,(\text{num\_full\_temp}) + 0.000649\,(\text{year\_started}) \\
& + 3.71 \times 10^{-12}\,(\text{long\_term\_debt}) + 6.67 \times 10^{-8}\,(\text{Revenue}) + 0.117\,(\text{EPS}) \\
& - 0.117\,(\text{DPS}) + 0.011\,(\text{PE}) - 2.97 \times 10^{-11}\,(\text{ynum}) - 2.97 \times 10^{-11}\,(\text{yr}) \\
& - 4.205 \times 10^{-12}\,(\text{lteps}) - 2.087 \times 10^{-17}\,(\text{r2}) + 0.06\,(\text{d2})
\end{aligned}
$$

Figure 3: The residual plots of the 7 variables. Pinching is visible in the 2nd plot of the first row, for the year variable. Curvature is visible in the 3rd plot of the 2nd row, for DPS.
Model II was significant, with an F value of 8.46 and a corresponding p value of less than
.0001. The model's $R^2$ and $R_a^2$ were 0.717 and 0.633 respectively. A
brief summary of the model is provided in Table 3. Furthermore, the Cp of 13 equals the
number of estimated parameters (12 terms plus the intercept), which is evidence that the
model is not biased.

Table 3: Model II summary with transformation, interactions, and higher order terms
Source            DF   Sum of Squares   Mean Square   F Value   P value
Model             12   3.21035          0.26753       8.46      <.0001
Error             40   1.26471          0.03162
Corrected Total   52   4.47506

The Cook’s D plot can be seen in Figure 4. The third observation was pulling the model
heavily towards itself. We therefore performed the Jackknife procedure until we were
satisfied that the model was not being pulled significantly towards a single observation.
This was performed on observations 3 and 9. The new model was

$$
\begin{aligned}
\text{Model III:}\quad \sqrt{\hat{\beta}} ={}& 0.5879 + 0.00049\,(\text{num\_full\_temp}) - 0.00001\,(\text{year\_started}) \\
& - 1.99 \times 10^{-12}\,(\text{long\_term\_debt}) - 1.68 \times 10^{-7}\,(\text{Revenue}) + 0.094\,(\text{EPS}) \\
& - 0.220\,(\text{DPS}) + 0.0098\,(\text{PE}) - 2.65 \times 10^{-7}\,(\text{ynum}) + 9.52 \times 10^{-11}\,(\text{yr}) \\
& - 2.88 \times 10^{-12}\,(\text{lteps}) - 1.11 \times 10^{-16}\,(\text{r2}) + 0.0696\,(\text{d2})
\end{aligned}
$$

Figure 4: The Cook’s D plot of Model II. The 3rd observation has a very high Cook’s D statistic.
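The influence screening that led from Model II to Model III can be illustrated on a much smaller problem. The sketch below is a simplified stand-in for the SAS Cook's D output, under stated assumptions: it uses a simple linear regression with one regressor rather than the 12-term model, the cutoff of 1.0 is a common rule of thumb rather than a threshold stated in this report, and the data and function names are hypothetical.

```python
def cooks_distances(xs, ys):
    """Cook's D for each point of a simple linear regression y = a + b*x."""
    n = len(xs)
    p = 2  # estimated parameters: intercept and slope
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in resid) / (n - p)
    lever = [1.0 / n + (x - mx) ** 2 / sxx for x in xs]  # hat-matrix diagonal
    return [e * e / (p * mse) * h / (1.0 - h) ** 2 for e, h in zip(resid, lever)]


def drop_most_influential(xs, ys, cutoff=1.0):
    """Remove the highest Cook's D point while it exceeds cutoff, refitting each time."""
    xs, ys = list(xs), list(ys)
    removed = []
    while len(xs) > 3:
        d = cooks_distances(xs, ys)
        worst = max(range(len(d)), key=d.__getitem__)
        if d[worst] <= cutoff:
            break
        removed.append((xs.pop(worst), ys.pop(worst)))
    return xs, ys, removed


# Hypothetical data: seven points near the line y = x plus one distant outlier.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.9, 2.0]
kept_x, kept_y, removed = drop_most_influential(x, y)
print(removed)  # only the outlier at x = 20 is removed
```

As in the analysis above, the fit is recomputed after each removal, because deleting one highly influential observation changes the influence of every remaining observation.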
Model III was significant, with an F value of 8.62 and a corresponding p value of less than
.0001. The model's $R^2$ and $R_a^2$ were 0.731 and 0.647 respectively. A brief
summary of the model is provided in Table 4. Furthermore, with a Cp of 13, we again have
evidence that the model is not biased.

Table 4: Model III summary with transformation, interactions, and higher order terms after the Jackknife
Source            DF   Sum of Squares   Mean Square   F Value   P value
Model             12   3.17563          0.26464       8.62      <.0001
Error             38   1.16675          0.03070
Corrected Total   50   4.34238
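As a check, the summary statistics quoted for Model III follow directly from the sums of squares and degrees of freedom in Table 4. The arithmetic below uses only the entries of that table; the labels SSR, SSE, and SST for the model, error, and corrected total sums of squares are our notation and do not appear in the SAS output.

```latex
\begin{align*}
F &= \frac{SSR/12}{SSE/38} = \frac{3.17563/12}{1.16675/38}
   = \frac{0.26464}{0.03070} \approx 8.62 \\
R^2 &= \frac{SSR}{SST} = \frac{3.17563}{4.34238} \approx 0.7313 \\
R_a^2 &= 1 - \frac{SSE/38}{SST/50} = 1 - \frac{0.03070}{0.08685} \approx 0.6465 \\
\sqrt{MSE} &= \sqrt{0.03070} \approx 0.175
\end{align*}
```

The same steps applied to Table 3 give the values of 8.46, 0.717, and 0.633 reported for Model II.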
A residual analysis was performed on the final model. The errors appeared to be fairly
normally distributed and centered at 0, with minimal kurtosis present.
The residual plot showed that most of the observations were within 2 standard deviations of the
mean, and only a couple were beyond 2 standard deviations. The QQ plot was fairly linear.
This satisfied the 3 parts of the Gauss-Markov theorem and the 4th assumption of normally
distributed errors. The model predicted the higher values fairly well, but the predicted
versus actual values were more scattered for values less than .75. This was reasonable given
the $R^2$ of .73. These plots are provided in Figure 5.

Figure 5: The residual plots for analysis of Model III.
Discussion:
The final model, Model III, included 7 regressors, 3 interactions, and 2 higher-order terms. As
stated previously, the final model was

$$
\begin{aligned}
\text{Model III:}\quad \sqrt{\hat{\beta}} ={}& 0.5879 + 0.00049\,(\text{num\_full\_temp}) - 0.00001\,(\text{year\_started}) \\
& - 1.99 \times 10^{-12}\,(\text{long\_term\_debt}) - 1.68 \times 10^{-7}\,(\text{Revenue}) + 0.094\,(\text{EPS}) \\
& - 0.220\,(\text{DPS}) + 0.0098\,(\text{PE}) - 2.65 \times 10^{-7}\,(\text{ynum}) + 9.52 \times 10^{-11}\,(\text{yr}) \\
& - 2.88 \times 10^{-12}\,(\text{lteps}) - 1.11 \times 10^{-16}\,(\text{r2}) + 0.0696\,(\text{d2})
\end{aligned}
$$

where ynum, yr, and lteps are the interactions between year and number of employees, year and
revenue, and long-term debt and EPS respectively, and r2 and d2 are the higher-order terms for
revenue squared and DPS squared.
Table 5: Parameter Estimates
Variable         Parameter Estimate   Standard Error   Type I SS   95% Confidence Limits (Lower, Upper)
Intercept        0.588                1.929            N/A         -3.317         4.49
num_full_temp    0.00049              0.00026          0.533       -0.00003       0.001
year_started     -0.000010            0.00095          0.0001      -0.00193       0.002
long_term_debt   -1.9859E-12          6.92841E-12      0.072       -1.6012E-11    1.20399E-11
Revenue          -1.67648E-7          1.712373E-7      1.197       -5.14299E-7    1.790042E-7
EPS              0.094                0.035            0.103       0.024          0.164
DPS              -0.22                0.118            0.154       -0.458         0.018
PE               0.0098               0.00405          0.279       0.0016         0.018
Ynum             -2.65166E-7          1.334934E-7      0.371       -5.35409E-7    5.077335E-9
Yr               9.51586E-11          8.99055E-11      0.033       -8.6846E-11    2.77163E-10
Lteps            -2.8852E-12          2.34561E-12      0.099       -7.6337E-12    1.86319E-12
r2               -1.1093E-16          5.58336E-17      0.095       -2.2396E-16    2.10264E-18
d2               0.0696               0.0249           0.239       0.019          0.12
Model III captured about 73% of the variation in the data, as indicated by the $R^2$ of
0.731. With a root MSE of about 0.175, about 95% of the observations are expected to fall
within 2(0.175), or .35, of the predicted value. While the year variable and its interactions
did not contribute the most to $R^2$, they did contribute partially. The variable with the
biggest impact on the $R^2$ was revenue, together with its interaction. This was deduced
from the Type I SS statistic. These figures are summarized in Tables 5 and 6.

Table 6: Important Figures for Model III
Root MSE         0.175
Dependent Mean   0.817
$R^2$            0.7313
$R_a^2$          0.6465

We then analyzed the observations with missing beta values. The three with missing values
were ExxonMobil, Chevron, and ONE Gas, or observations 3, 9 and 60 respectively. The
predicted values for each, on the square-root scale on which Model III was fitted, were
-15.9, -2.1, and 0.699 respectively. The 95% prediction intervals for an individual value
were -38 to 6, -7 to 2.5, and 0.32 to 1.07 respectively. The 95% confidence intervals for
the mean value were -38 to 6, -7 to 2.5, and 0.58 to 0.82 respectively. This is summarized
in Table 7.

Table 7: Estimates for Missing Values
Obs   Predicted Value   Std Error Mean Predict   95% CL Mean          95% CL Predict
3     -15.8917          10.8608                  -37.8782   6.0949    -37.8811   6.0978
9     -2.0756           2.2597                   -6.6501    2.4990    -6.6639    2.5127
60    0.6988            0.0590                   0.5794     0.8182    0.3245     1.0731

The predictions for ExxonMobil and Chevron were clearly far from their observed values. We
have observed values for these two firms because they were removed during the Jackknife
procedure. The observed beta values were 1.13 and 1.2 for ExxonMobil and Chevron
respectively. However, we were not entirely surprised, as these firms are often cited as
outliers in the market and are consistently noted as among the best stocks to invest in.
ExxonMobil is the largest publicly traded oil company. It is considered to have world-class
manufacturing efficiency and scale and is highly recommended for an investment
portfolio.8 Chevron is the world’s 4th largest oil firm and is also
considered one of the top stocks with above-average dividend yields.9 As mentioned previously,
we performed the Jackknife on these two observations because they were pulling the model
extremely strongly. Performing the Jackknife made the model more
accurate for typical observations but less accurate for outliers. Therefore, we were not
surprised that the model could not estimate these companies correctly, as they are
understood to be special and unique.
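The interval limits in Table 7 can be reproduced from the predicted value, the standard error of the mean prediction, and the error mean square in Table 4. The Python sketch below is ours and rests on one assumption that does not appear in the report: the 0.975 quantile of Student's t with 38 degrees of freedom is taken as 2.0244 from standard tables. The function names are hypothetical.

```python
from math import sqrt

# Assumed constant: 0.975 quantile of Student's t with 38 error degrees of
# freedom (the error DF in Table 4), taken from standard t tables.
T_975_DF38 = 2.0244


def mean_interval(pred, se_mean, t=T_975_DF38):
    """95% confidence limits for the mean response at a given point."""
    return pred - t * se_mean, pred + t * se_mean


def prediction_interval(pred, se_mean, mse, t=T_975_DF38):
    """95% limits for a single new observation: adds the error variance."""
    se_pred = sqrt(se_mean ** 2 + mse)
    return pred - t * se_pred, pred + t * se_pred


mse = 1.16675 / 38  # error mean square from Table 4

# Observation 60 (ONE Gas): predicted value 0.6988, standard error 0.0590.
lo_mean, hi_mean = mean_interval(0.6988, 0.0590)
lo_pred, hi_pred = prediction_interval(0.6988, 0.0590, mse)
print(round(lo_mean, 3), round(hi_mean, 3))  # close to 0.579 and 0.818
print(round(lo_pred, 3), round(hi_pred, 3))  # close to 0.325 and 1.073
```

For observations 3 and 9 the standard error of the mean prediction is so large relative to the root MSE that the two intervals nearly coincide, which is why their limits in Table 7 are almost identical.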
Conclusion:
The results of this analysis were somewhat comparable to Melicher’s work. We found
that very different measures were able to provide much more explanation than the variables used
in his analysis. Additionally, we found that the year in which a company was incorporated does
affect the beta value. Older companies would have larger beta values than newer ones,
holding the other variables constant. However, since the year of incorporation interacts
with other variables, it is difficult to say how precise this claim is. Specifically, the
interaction between year and the number of employees suggests that beta values will be
smaller for newer companies, while the interaction between year and revenue suggests that
beta values will be larger for newer companies. Nonetheless, the year a firm was incorporated
was an important factor in helping to explain beta values. While we were able to produce a
useful model for the oil industry, we recognize that it may be problematic to apply this same
model to other industries. However, an analysis comparing this model to models from other
sectors, or to oil and natural gas industries in other countries, would be enlightening.

8 The Value Line Investment Survey. 38th ed. Vol. LXX. New York: Arnold Bernhard, 2015. 504-505. Print.
9 Ibid.
Appendix
/** final code for finance ind study **/
/** Import an XLSX file. **/
PROC IMPORT DATAFILE="/folders/myfolders/Finance/oil final.xlsx"
OUT=WORK.finance
DBMS=XLSX
REPLACE;
RUN;
/** Print the results. **/
PROC PRINT DATA=WORK.finance; RUN;
title "vif on all variables";
proc reg data= finance;
model beta = num_full_temp--pe / vif;
run;
title "vif on remaining variables";
proc reg data= finance;
model beta = num_full_temp--revenue EPS--pe / vif;
run;
/** vif complete **/
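/** derive transformed responses, higher order terms, and interactions **/
data final;
set finance;
logy=log(beta); /* missing when beta <= 0 */
sqrty=sqrt(beta); /* missing when beta < 0 */
asiny=ARSIN(beta); /* missing when beta is outside [-1, 1] */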
lt2= long_term_debt*long_term_debt;
r2= revenue*revenue;
eps2= eps*eps;
pe2= pe*pe;
ynum=year_started*num_full_temp;
ylt=year_started*long_term_debt;
yr=year_started*revenue;
yeps=year_started*eps;
ydps=year_started*dps;
ype=year_started*pe;
ltr= long_term_debt*revenue;
lteps= long_term_debt*eps;
ltdps=long_term_debt*dps;
ltpe= long_term_debt*pe;
rpe= revenue*pe;
reps=revenue*eps;
rdps=revenue*dps;
rps=revenue*pe;
epsdps=eps*dps;
epspe=eps*pe;
dpspe=dps*pe;
ltrpe= long_term_debt*pe*revenue;
yepspe= year_started*eps*pe;
d2=dps*dps;
run;
/** trying transformations **/
title "Finance asin(y)";
proc reg data= final;
model asiny= num_full_temp--revenue EPS--pe;
run;
title "Finance Log(y)";
proc reg data= final;
model logy = num_full_temp--revenue EPS--pe;
run;
title "Finance Sqrt(y)";
proc reg data= final;
model sqrty= num_full_temp--revenue EPS--pe;
run;
/** selection techniques did not work try brute force**/
title "Finance Variable Selection";
proc reg data= final plots(label) = (cooksd RSTUDENTBYPREDICTED);
model sqrty= num_full_temp--revenue EPS--pe ynum yr lteps r2 d2;
run;
PROC IMPORT DATAFILE="/folders/myfolders/Finance/RE oil final.xlsx"
OUT=WORK.refinance
DBMS=XLSX
REPLACE;
RUN;
data refinal;
set refinance;
logy=log(beta);
sqrty=sqrt(beta);
asiny=ARSIN(beta);
lt2= long_term_debt*long_term_debt;
r2= revenue*revenue;
eps2= eps*eps;
pe2= pe*pe;
ynum=year_started*num_full_temp;
ylt=year_started*long_term_debt;
yr=year_started*revenue;
yeps=year_started*eps;
ydps=year_started*dps;
ype=year_started*pe;
ltr= long_term_debt*revenue;
lteps= long_term_debt*eps;
ltdps=long_term_debt*dps;
ltpe= long_term_debt*pe;
rpe= revenue*pe;
reps=revenue*eps;
rdps=revenue*dps;
rps=revenue*pe;
epsdps=eps*dps;
epspe=eps*pe;
dpspe=dps*pe;
ltrpe= long_term_debt*pe*revenue;
yepspe= year_started*eps*pe;
/**yltepspe=long_term_debt*yepspe;**/
/**ryepspe=revenue*year_started*eps*pe;**/
/**ltyr=long_term_debt*year_started*revenue;**/
/**ltrpe=long_term_debt*revenue*pe;**/
yy=year_started*year_started;
d2=dps*dps;
/**yepsdps=year_started*eps*dps; doesnt help**/
ltepsy=lteps*year_started;
yepsper= yepspe*r2;
run;
title "Finance Removed";
proc reg data= refinal plots(label) = (cooksd RSTUDENTBYPREDICTED);
model sqrty= num_full_temp--revenue EPS--pe ynum yr lteps r2 d2/ cli
clb clm;
run;
title "GLM model";
proc glm data=refinal;
model sqrty= num_full_temp--revenue EPS--pe ynum yr lteps r2 d2;
run;
Works Cited
Chang, Wei-Chien, and Donald E. Weiss. "An Examination of the Time Series Properties of
Beta in the Market Model." Journal of the American Statistical Association, 1991, 883.
Accessed May 25, 2015.
"Key Statistics Definitions." Yahoo Finance. Accessed May 26, 2015.
https://help.yahoo.com/kb/finance/SLN2347.html?impressions=true.
Lintner, John. "The Valuation of Risk Assets and the Selection of Risky Investments in Stock
Portfolios and Capital Budgets." The Review of Economics and Statistics 47, no. 1
(1965): 13. Accessed May 23, 2015.
Melicher, Ronald W. "Financial Factors Which Influence Beta Variations within an
Homogeneous Industry Environment." The Journal of Financial and Quantitative
Analysis, 1974, 231. Accessed May 25, 2015.
Sharpe, William F. "Capital Asset Prices: A Theory of Market Equilibrium under Conditions of
Risk." The Journal of Finance XIX, no. 3 (1964): 425. Accessed May 23, 2015.
The Value Line Investment Survey. 38th ed. Vol. LXX. New York: Arnold Bernhard, 2015.
504-505.