Yes, but only in a qualified sense: when the standard SLR/MLR assumptions hold, least squares gives the best linear unbiased estimates (this is the Gauss-Markov result). It is only because of those assumptions that the "best" properties hold, which is why each one is worth checking.
This is likely the most important assumption. If your data don't fit this assumption, it will usually be obvious that you need a different model. The linearity assumption says that the relationship between X and Y is approximately linear in shape when plotted. It fails if the scatter is curved in any way, but if all the other assumptions hold, you may still be able to fit a different kind of regression, such as a parabola (a quadratic model). A quick visual check is sketched below.
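Here is a minimal sketch of that visual check, using hypothetical data (the variables x, y, and the random-number setup are illustrative, not from the original answer):

```python
# A minimal linearity check: plot the data with the fitted line and
# look for curvature. Hypothetical data for illustration only.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 1.5 * x + rng.normal(0, 1, 100)  # linear relationship plus noise

# Fit a simple linear regression by least squares.
slope, intercept = np.polyfit(x, y, 1)

# If the scatter bends away from the red line, the linearity assumption
# is in doubt and a transformed or polynomial model may fit better.
plt.scatter(x, y, alpha=0.5)
plt.plot(np.sort(x), intercept + slope * np.sort(x), color="red")
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Scatter with fitted line: look for curvature")
plt.show()
```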
The errors in a model are the differences between the predicted regression line and the actual locations of the individual points we're plotting. This assumption holds that the errors are not correlated and are instead independent of one another. It fails when there is correlation among the errors, which can show up as clusters of points rather than an even scatter around the line. A common diagnostic is sketched below.
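One common way to check this is the Durbin-Watson statistic. A minimal sketch, assuming you already have residuals from a fitted model (the residuals here are hypothetical stand-ins):

```python
# Durbin-Watson test for correlated errors. Values near 2 suggest
# uncorrelated errors; values near 0 or 4 suggest positive or
# negative autocorrelation, respectively.
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
residuals = rng.normal(0, 1, 200)  # stand-in for real model residuals

dw = durbin_watson(residuals)
print(f"Durbin-Watson statistic: {dw:.2f}")  # ~2 for independent errors
```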
The Normality assumption is that the errors follow a roughly Normal distribution with a mean of 0. That is, while there may be outliers, the majority of the errors can be found fairly near the regression line. If we overlay a mini Normal curve at different points along the line, we can see how far we would expect most points to fall from it. Contrast this with a uniform distribution, where every distance from the line is about equally likely and the scatter looks more like a rectangle. A quick check on the residuals is sketched below.
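A minimal sketch of a normality check on the residuals, again with hypothetical stand-in data:

```python
# Normality check: a Q-Q plot plus the Shapiro-Wilk test.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
residuals = rng.normal(0, 1, 150)  # stand-in for real model residuals

# Q-Q plot: points hugging the diagonal suggest roughly Normal errors.
stats.probplot(residuals, dist="norm", plot=plt)
plt.show()

# Shapiro-Wilk test: a small p-value is evidence against normality.
stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk p-value: {p:.3f}")
```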
This is the megaphone or cone assumption. Homoscedasticity of errors states that the errors along the regression line have a consistent standard deviation. The assumption usually fails when the spread of points grows larger as X increases, so the scatter fans out into the shape of a cone, as if a megaphone were calling out for attention to how dramatically the assumption is being broken. For MLR this is simply extended to all of the predictors: where the SLR version concerns the spread of errors across a single Xi, the MLR version concerns the spread across all Xi. A formal test is sketched below.
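A minimal sketch of a homoscedasticity check using the Breusch-Pagan test, with hypothetical data deliberately built to show the megaphone pattern:

```python
# Homoscedasticity check: fit OLS, then run Breusch-Pagan on the residuals.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 200)
# Error spread grows with x: the classic cone/megaphone pattern.
y = 1.0 + 2.0 * x + rng.normal(0, 0.5 * x)

X = sm.add_constant(x)
model = sm.OLS(y, X).fit()

# Breusch-Pagan test: a small p-value is evidence of heteroscedasticity.
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(model.resid, model.model.exog)
print(f"Breusch-Pagan p-value: {lm_p:.4f}")
```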
This states that the independent variables are not correlated with one another. While some correlation is still likely to happen, it is good practice to avoid having too many independent variables that are correlated with each other, since even in the best of situations only one of them may be needed in the regression model. Imagine all of the information we could predict as a hot dog, and each independent variable as a little hot-dog-eating creature. Feature A takes a bite of the hot dog and is thus able to predict a portion of the data. If feature B is heavily correlated with feature A, then feature B can only take roughly the same bite that feature A already took, and so it is left without hot dog. This is all to say that even if you can perform MLR with correlated features, it's kind of silly and pointless to do so. A common way to quantify this is with variance inflation factors, sketched below.
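A minimal sketch of that check using variance inflation factors (VIF), with hypothetical features A, B, C where B is nearly a copy of A. A common rule of thumb treats VIF above roughly 5-10 as a sign of problematic correlation:

```python
# Multicollinearity check: compute VIF for each candidate feature.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
a = rng.normal(0, 1, 100)
b = a + rng.normal(0, 0.1, 100)  # feature B is nearly a copy of feature A
c = rng.normal(0, 1, 100)        # feature C is independent

X = sm.add_constant(pd.DataFrame({"A": a, "B": b, "C": c}))
for i, name in enumerate(X.columns):
    if name != "const":
        print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.1f}")
# Expect large VIFs for A and B (redundant bites of the same hot dog)
# and a VIF near 1 for C.
```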
So the answer is conditional: based on these assumptions holding, SLR/MLR gives the best (linear unbiased) model with all of those desirable properties; when the assumptions fail, it does not.