Algorithms
Machine learning algorithms automatically derive rules from the input data they receive. Many algorithms are available, each suited to different purposes, so selecting the one that best fits a programmer's purpose is highly important.
Generalisation
Generalisation describes how well a machine learning algorithm can identify or forecast data it has not seen before. It's important to train the algorithm on a wide range of data to increase its accuracy in interpreting and predicting new data.
Let's imagine we need to train a machine learning model to distinguish between dogs and cats. If we only provide the model with two breeds of dogs, it will have a low classification score when it's asked to recognise a variety of dog breeds. The classification score refers to how successful a model is at assigning the correct label, for example the share of test examples it labels correctly. Therefore, data diversity is a significant factor in improving the generalisation of a machine learning model.
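The dogs-and-cats scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two features (body weight and snout length), every number in the data, and the choice of a nearest-neighbour classifier are invented for the example, and the classification score is taken to be plain accuracy.

```python
import math

# Hypothetical features per animal: (body weight in kg, snout length in cm).
# Every number below is invented for illustration.
CATS = [(4.0, 2.0), (5.0, 2.5), (3.5, 1.8)]
# Narrow training set: only two large dog breeds.
NARROW_DOGS = [(30.0, 10.0), (32.0, 10.5), (35.0, 11.0), (36.0, 11.5)]
# Diverse training set: small, medium and large dogs.
DIVERSE_DOGS = [(30.0, 10.0), (35.0, 11.0), (3.0, 5.0), (7.0, 6.0), (15.0, 8.0)]

# Unseen animals covering a variety of dog sizes.
TEST_SET = [
    ((4.5, 2.2), "cat"),
    ((3.8, 1.9), "cat"),
    ((2.5, 4.5), "dog"),
    ((6.0, 5.5), "dog"),
    ((12.0, 7.5), "dog"),
    ((33.0, 10.8), "dog"),
    ((20.0, 9.0), "dog"),
]


def label_examples(cats, dogs):
    """Combine both classes into one list of (features, label) pairs."""
    return [(f, "cat") for f in cats] + [(f, "dog") for f in dogs]


def predict(training, features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training, key=lambda example: math.dist(example[0], features))
    return nearest[1]


def classification_score(training, test_set):
    """Fraction of test examples that receive the correct label (accuracy)."""
    correct = sum(predict(training, f) == label for f, label in test_set)
    return correct / len(test_set)


narrow_score = classification_score(label_examples(CATS, NARROW_DOGS), TEST_SET)
diverse_score = classification_score(label_examples(CATS, DIVERSE_DOGS), TEST_SET)

print(f"Trained on two breeds:  {narrow_score:.2f}")
print(f"Trained on many breeds: {diverse_score:.2f}")
```

With this made-up data, the model trained on two large breeds labels the small and medium dogs as cats, because nothing in its training data resembles them. The model trained on a more varied set of dogs labels every test animal correctly.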
Overfitting
Overfitting refers to a model that has been fitted so closely to its training data that it performs worse on new data. This leads to the algorithm having a high variance in its predictions.
Variance refers to how much the algorithm's predictions change when it is trained on different data. A high-variance algorithm has effectively over-analysed its training data, learning the noise along with the real pattern. This means that the algorithm may be accurate on average but is inconsistent overall. When training a model, it's important to find a balance between learning too little and learning too much from the training data so that the predictions are accurate and consistent.
(FreeCodeCamp, 2020)
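One way to see overfitting is to compare a model that memorises its training data with a simpler one. The sketch below is an illustration only, not taken from the cited source: the data points are invented (roughly y = 2x plus noise), and the two models, a nearest-neighbour "memoriser" and a least-squares straight line, are assumptions chosen to keep the example short.

```python
# Invented training data: roughly y = 2x with some noise added by hand.
TRAIN = [(1.0, 2.5), (2.0, 3.4), (3.0, 6.8), (4.0, 7.3), (5.0, 10.9), (6.0, 11.6)]
# Invented unseen data drawn from the same underlying trend.
TEST = [(1.4, 2.9), (2.6, 5.1), (3.4, 6.9), (4.6, 9.1), (5.4, 10.7)]


def memoriser_predict(x):
    """Overfitted model: return the y of the closest training x, noise included."""
    nearest = min(TRAIN, key=lambda point: abs(point[0] - x))
    return nearest[1]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(predict, points):
    """Average squared gap between predictions and actual values."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


slope, intercept = fit_line(TRAIN)


def line_predict(x):
    """Simpler model: a straight line that ignores point-by-point noise."""
    return slope * x + intercept


memoriser_train_mse = mean_squared_error(memoriser_predict, TRAIN)
memoriser_test_mse = mean_squared_error(memoriser_predict, TEST)
line_train_mse = mean_squared_error(line_predict, TRAIN)
line_test_mse = mean_squared_error(line_predict, TEST)

print(f"Memoriser: train error {memoriser_train_mse:.3f}, test error {memoriser_test_mse:.3f}")
print(f"Line:      train error {line_train_mse:.3f}, test error {line_test_mse:.3f}")
```

On this data the memoriser has zero error on the training points but a much larger error on the unseen points, while the straight line has a small error on both. A perfect training score alongside a poor test score is the typical sign of overfitting.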
Fun fact
Concepts from linear algebra, calculus, probability theory, and statistics are essential for developing machine learning algorithms. These algorithms use mathematical equations and functions to identify patterns, make predictions, and classify information.