**Introduction**

My project tries to answer the following questions: Is college worth the cost? Which universities have the highest-earning graduates? And what is the best predictor of income five years after graduation?

**Motivation**

I came across The Data and Storage Library (DASL) while working on this project and comparing different data sources. The website contains data going back to 1996, but this project uses the most recent data, gathered in 2018. The data provide several possible predictors, along with background information, that suit the questions analyzed here and the construction of regression models.

Finding the correlation between average college cost and average earnings five years after graduation would be interesting.

**Background**

Many studies have suggested that the more a university costs to attend, the greater the return on that investment in the form of income after graduation. However, recent research and common opinion on the current education market suggest that this hypothesis may not be accurate.

With recent market changes such as the rise of online educational programs and the growing number of students with student loans, there is an urgent need for an empirical assessment of the impact of cost on eventual achievement. A university's total cost is not the best indicator of the benefits after graduation.

While a university's cost proved to be a major factor in determining salaries when examined on its own, it was not a major factor once other educational variables, such as average college spending, were included. When demographic and socio-economic variables such as ethnicity and average household earnings were also included in the equation, it became even more negligible.

The series of tests suggests that prospective college students need not make enormous investments on the premise that the payoff will exceed the amount invested in school. Instead, the investment decision should be based on broader factors, such as how much a university spends on its students.

A competitive school's high price signals prestige, which creates the illusion of greater opportunity. There is an illusory link between opportunity, expenditure, and prestige. With time, it becomes apparent that prestige is no longer appealing to many students.

Of course, the idea that a college education is crucial for a good life is supported by both statistical and social evidence. There is little evidence, however, as to whether there is a benefit in attending college.

My project seeks to explore whether this still holds for producing highly competitive graduates, taking into account the current social influences on academic costs and the growing culture of competition among both ranked and unranked universities. My paper does not argue against attending college; rather, it argues that there are cheaper alternatives that offer the same benefits.

**Data**

As stated above, I gathered my data from The Data and Storage Library (DASL). New.time.com obtained their information.

My project analyzes mean student income five years after graduation at 706 different universities and colleges in the United States. The sample includes both private and public universities.

Using these data, a simple regression of mean income on the average yearly price of attendance should reveal any connection between rising educational costs and higher subsequent income. This simple regression, however, will not be a precise model, because it does not account for other variables that may influence graduates' mean income.

A separate multiple regression on the factors listed in the table below should produce a more accurate model of mean income and provide more information on the effect of college cost on a student's eventual income.

Variables such as average student income, the average SAT score of the student population, the fraction of merit-aided students, the fraction of need-based aided students, and whether a university is public or private were included because they can intuitively be expected to reflect, individually or together, the quality of education a university offers. They can also indicate how strong the students at each college are likely to be.

| Name | Description | Units | Type |
|---|---|---|---|
| i-earn | The mean income of students 5 years after graduation | US Dollars | Dependent |
| Public | Dummy variable; whether or not a university is private or public | Boolean | Independent |
| Price | Average cost of attending institution with no aid | Log of US Dollars | Independent |
| p-paid | The average cost of attending the respective institution with aid | Log of US Dollars | Independent |
| Sat | The average SAT score of accepted students at the respective institution | SAT Score | Independent |
| n_frac | Fraction of students who are need-based aided | Percentage of student class | Independent |
| m_aided | Fraction of students who are merit aided | Percentage of student class | Independent |

| Variable | Observations | Mean | Standard Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| i-earn | 706 | 45597.7 | 6723.932 | 28300 | 79700 |
| public | 706 | 0.379603 | 0.485632 | 0 | 1 |
| price | 706 | 42199.79 | 15726.77 | 16500 | 70400 |
| p-aid | 706 | 23491.36 | 8270.414 | 2200 | 49200 |
| Sat | 706 | 1141.778 | 136.572 | 810 | 1550 |
| n_frac | 706 | 0.572242 | 0.171056 | 0.07 | 1 |
| m_aided | 706 | 0.156667 | 0.106342 | 0.01 | 0.55 |

**Statistics on all the variables used in the project**
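As a minimal sketch of how one row of the summary table could be produced, assume a variable has been loaded into a plain Python list. The helper name `summarize` and the sample values are made up for illustration and are not taken from the dataset.

```python
import statistics

def summarize(values):
    """Return count, mean, sample standard deviation, minimum and maximum."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
        "min": min(values),
        "max": max(values),
    }

# Hypothetical earnings figures, for illustration only.
sample_earnings = [28300, 41000, 45600, 52000, 79700]
summary = summarize(sample_earnings)
```

Applying the same helper to each column would give one table row per variable.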

**Model**

**Regression Models**

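$$\text{l\_earn} = \beta_0 + \delta_1\,\text{public} + \beta_1\,\text{price} + u$$

$$\text{l\_earn} = \beta_0 + \delta_1\,\text{public} + \beta_1\,\text{price} + \beta_2\,\text{p-aid} + \beta_3\,\text{sat} + \beta_4\,\text{m-aided} + \beta_5\,\text{n-frac} + u$$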
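To make the estimation step concrete, the first specification above can be fitted by ordinary least squares through the normal equations. This is only a sketch in plain Python, not the software used for this paper; the helper names and the tiny data set (in thousands of dollars) are invented so that the true coefficients are known in advance.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def ols(rows, y):
    """OLS coefficients [intercept, slopes...] from X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up (public, price) rows; earnings are built exactly as
# 30 + 5 * public + 0.2 * price, so OLS should recover those values.
rows = [(0, 20.0), (1, 15.0), (0, 40.0), (1, 30.0), (0, 60.0)]
earnings = [30.0 + 5.0 * pub + 0.2 * price for pub, price in rows]
b0, d1, b1 = ols(rows, earnings)
```

The second specification would use the same `ols` call with more columns per row. In practice a statistics package would also report standard errors and R-squared.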

**Model Assumptions**

The Gauss-Markov assumptions must be met for a linear regression to be reliable.

**Linearity of Model:**

The two models use the standard form $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots$, which confirms the linearity assumption.

**Random sampling:**

Since the Data and Storage Library (DASL) holds data on almost every university relevant to my project's hypothesis, the data can be treated as a random sample for what my project wants to examine. Some institutions, however, withhold their records for privacy reasons. This is acknowledged, but the institutions' choice to withhold their information varies arbitrarily.

**Zero Conditional Mean:**

Admittedly, my two regressions will not satisfy the zero conditional mean assumption, because some important variables are not included in them. These include race, the college acceptance rate, the institution's educational spending per student, and the proportion of students who earn a degree at the institution within the expected four years.

Because the included regressors are correlated with these omitted variables, which are themselves significant explanatory variables of mean income in the dataset, the model will be biased.
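The direction of this bias can be illustrated with the standard omitted-variable result for the simplest case of one included regressor $x_1$ and one omitted regressor $x_2$. The symbols $\tilde{\beta}_1$ and $\tilde{\delta}_1$ are notation introduced only for this illustration:

$$E[\tilde{\beta}_1] = \beta_1 + \beta_2\,\tilde{\delta}_1$$

Here $\tilde{\beta}_1$ is the slope from regressing income on $x_1$ alone, $\beta_2$ is the true coefficient on the omitted $x_2$, and $\tilde{\delta}_1$ is the slope from regressing $x_2$ on $x_1$, with the expectation taken conditional on the sample values of the regressors. The bias is positive when $\beta_2$ and $\tilde{\delta}_1$ share a sign and negative when they differ.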

**No Perfect Collinearity:**

This model will be biased owing to correlations between the included regressors and omitted variables that are also significant explanatory factors of mean income in the dataset. These omitted variables include race, the college acceptance rate, the institution's educational expenditure per student, and the proportion of students who receive a degree within the expected four years.

*Correlation data between my variables*.
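A pairwise correlation table of this kind can be computed as in the sketch below. The function names and the sample columns are hypothetical and are not the values from the dataset. A correlation of exactly 1 or -1 between two different regressors would signal perfect collinearity.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise correlations for a dict of name -> list of values."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Made-up values for illustration only; not taken from the dataset.
columns = {
    "price": [20.0, 35.0, 50.0, 65.0],
    "p_aid": [12.0, 18.0, 22.0, 30.0],
    "sat": [1000.0, 1100.0, 1250.0, 1400.0],
}
corr = correlation_matrix(columns)
```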

**Homoscedasticity:**

The residual variance in the models used in this paper increases but stays fairly constant overall, as seen in the residual plots below. This behavior should approximate the assumption closely enough to minimize the bias produced by violating it.

**Results and Interpretation**

**Simple Regression**

The simple regression analyzes students' average income five years after graduation against the average price of attendance per student.

$$\text{l\_earn} = \underset{(695.1075)}{40420.52} + \underset{(0.0154362)}{0.1226824}\,\text{p-cost} + u$$

$R^2$: 0.0823, Adjusted $R^2$: 0.0810

The R-squared indicates that price explains little of the variation in income on its own. The price coefficient is positive, however, which suggests a general correlation between the two variables.
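A low R-squared and a statistically significant slope can coexist. As a rough check, the t-ratio of the price coefficient can be computed from the figures reported for the simple regression. The function name is illustrative, and 1.96 is the usual large-sample two-tailed 5% critical value rather than a number taken from this paper.

```python
def t_statistic(coefficient, standard_error):
    """t-ratio for the null hypothesis that the true coefficient is zero."""
    return coefficient / standard_error

# Slope and standard error as reported in the simple regression.
t_price = t_statistic(0.1226824, 0.0154362)

# 1.96 is the large-sample two-tailed 5% critical value.
significant_at_5_percent = abs(t_price) > 1.96
```

The ratio is close to 8, so the slope is distinguishable from zero even though price alone explains only about 8 percent of the variation in income.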

**Multiple Regression**

The multiple regression adds further explanatory variables to reduce bias and make the coefficients more meaningful.

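$$
\begin{aligned}
\text{l\_earn} ={}& \underset{(3080.8)}{22595.85} + \underset{(1057.121)}{7687.722}\,\text{public} + \underset{(0.0405251)}{0.2964476}\,\text{p-cost} - \underset{(0.0524663)}{0.0312917}\,\text{p-aid} \\
&+ \underset{(2.246219)}{10.60806}\,\text{sat\_avg\_all} - \underset{(1656.338)}{6890.43}\,\text{n-frac} + \underset{(2216.526)}{145.1307}\,\text{m-aided} + u
\end{aligned}
$$

$R^2$: 0.3802, Adjusted $R^2$: 0.3743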

The higher adjusted R-squared in the multiple regression model indicates that it explains more of the variation in income, which should reduce bias. Higher SAT scores and a larger share of merit-aided students tend to be associated with higher income. While the school attended has a positive correlation with earnings, a larger share of need-based aided students decreases earnings according to the regression.

**Log Multiple Regression**

Taking logs in the multiple regression, together with robust standard errors, helps reduce bias further and makes the coefficients more meaningful.

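$$
\begin{aligned}
\text{l\_earn} ={}& \underset{(0.3704127)}{6.009592} + \underset{(0.21019)}{0.2108225}\,\text{public} + \underset{(0.307892)}{0.3158855}\,\ln\text{price} - \underset{(0.022842)}{0.0312917}\,\ln\text{p-aid} \\
&+ \underset{(0.0543525)}{0.2304095}\,\ln\text{sat} - \underset{(0.0152217)}{0.090365}\,\ln\text{n-frac} + \underset{(0.005085)}{0.0039056}\,\ln\text{m-aided} + u
\end{aligned}
$$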

**Regression results for Natural Log of mean Income**

| Variable | Coefficient | Standard Error |
|---|---|---|
| Public | 0.2108225** | (0.0226622) |
| Price | 0.3158855** | (0.0323973) |
| Sat | 0.2304095* | (0.0493987) |
| Price with aid | -0.0381008 | (0.0239273) |
| Need fraction | -0.090365 | (124.25) |
| Merit aided | 0.0039056 | (0.005099) |
| Constant | 6.009592 | (0.3378209) |
| N | 706 | |
| R2 | 0.4045 | |

*p < 0.10; **p < 0.05; ***p < 0.01 Standard errors presented in parentheses
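Coefficients in a model with a logged dependent variable are read as approximate percentage effects. The sketch below shows that conversion for two of the reported coefficients, assuming (as the specification states) that income and price enter in natural logs and that public is a 0/1 dummy. The function names are illustrative.

```python
import math

def dummy_effect_percent(coefficient):
    """Exact percentage change in the level of y when a 0/1 dummy switches on,
    for a model whose dependent variable is ln(y)."""
    return (math.exp(coefficient) - 1.0) * 100.0

def elasticity_effect_percent(coefficient, percent_change_in_x):
    """Approximate percentage change in y for a given percentage change in x,
    for a regressor that enters the model as ln(x)."""
    return coefficient * percent_change_in_x

# Coefficients as reported in the regression results table.
public_effect = dummy_effect_percent(0.2108225)            # public vs. private
price_effect = elasticity_effect_percent(0.3158855, 10.0)  # 10% higher price
```

Under these assumptions, a 10 percent higher price is associated with roughly 3 percent higher income, holding the other regressors fixed.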

**F Testing**

Both cost and public were significant in the log multiple regression only at the two-tailed 10 percent level. However, based on prior research, the two were expected to be significant at all levels.

I performed an F-test for joint significance. The unrestricted R2 was 0.4045 for the log model, while the R2 of the restricted model was 0.3802. With degrees of freedom of 5 and 700, the F value was 2.1750; thus the variables are not jointly significant at the 5% level.
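For reference, the R-squared form of the F statistic can be sketched as below. The function name and the example inputs are hypothetical. The formula assumes the restricted model is nested in the unrestricted one and that both share the same dependent variable; the number of restrictions `q` and the residual degrees of freedom `df_resid` must be supplied from the models being compared.

```python
def f_statistic(r2_unrestricted, r2_restricted, q, df_resid):
    """R-squared form of the F statistic for q exclusion restrictions.

    df_resid is n - k - 1 for the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_resid
    return numerator / denominator

# Hypothetical inputs for illustration only; not the models in this paper.
example_f = f_statistic(r2_unrestricted=0.50, r2_restricted=0.40, q=2, df_resid=100)
```

The resulting value is compared with the critical value of the F distribution with `q` and `df_resid` degrees of freedom.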

**Conclusion**

An institution's cost appears to have little impact on students' income five years after graduating from the institution. While cost is significant in the simple regression, it becomes insignificant when many variables are added in my multiple regression.

Furthermore, accounting for whether the university is public or private did not yield joint significance. Thus, my hypothesis is supported by the first multiple regression.

There is a correlation between the cost of the institution and potential income, but the correlation is weak. This is probably due to the greater overall educational expenditure at higher-cost institutions.

From my model, there is not a very significant marginal return on college education. Therefore, setting aside the prestige of attending elite colleges and choosing a cheaper institution over a more costly one may make more sense for a student.

**References**

- Fox, M. (1993). Is it a good investment to attend an elite private college? *Economics of Education Review, 12*(2), 137-151. Retrieved November 22, 2018.
- Pascarella, E., Smart, J., & Smylie, M. (1992). College tuition costs and early career socioeconomic achievement: do you get what you pay for? *High Educ, 24*(3), 275-290.
