Multiple Linear Regressions by Maximizing the Likelihood under Assumption of Generalized Gauss-Laplace Distribution of the Error

Comput Math Methods Med. 2016:2016:8578156. doi: 10.1155/2016/8578156. Epub 2016 Dec 7.

Abstract

Multiple linear regression analysis is widely used to link an outcome with predictors for better understanding of the behaviour of the outcome of interest. Usually, under the assumption that the errors follow a normal distribution, the coefficients of the model are estimated by minimizing the sum of squared deviations. A new approach based on maximum likelihood estimation is proposed for finding the coefficients on linear models with two predictors without any constrictive assumptions on the distribution of the errors. The algorithm was developed, implemented, and tested as proof-of-concept using fourteen sets of compounds by investigating the link between activity/property (as outcome) and structural feature information incorporated by molecular descriptors (as predictors). The results on real data demonstrated that in all investigated cases the power of the error is significantly different by the convenient value of two when the Gauss-Laplace distribution was used to relax the constrictive assumption of the normal distribution of the error. Therefore, the Gauss-Laplace distribution of the error could not be rejected while the hypothesis that the power of the error from Gauss-Laplace distribution is normal distributed also failed to be rejected.

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Databases, Factual
  • Ecology / methods
  • Hormones / chemistry
  • Likelihood Functions*
  • Linear Models*
  • Models, Chemical
  • Models, Statistical
  • Organic Chemicals / chemistry
  • Regression Analysis
  • Reproducibility of Results

Substances

  • Hormones
  • Organic Chemicals