This piece is the latest in a series called “Machine Learning Is Not Magic,” covering how to get started in machine learning using familiar tools such as Excel, Python, Jupyter Notebooks and cloud services from Azure and Amazon Web Services. Check back here each Friday for future installments.
When you are getting started with machine learning algorithms, it’s a great idea to work through the formulas in Excel first. Doing so gives you a thorough understanding of the concept behind each algorithm. But to build repeatable machine learning models that work with new data points, we have to use mature frameworks and tools. Once you are familiar with the concepts, you can start using higher-level libraries like NumPy and Scikit-learn in Python. In the upcoming parts of this tutorial, I will walk you through the process of configuring and using Python with the same use case, based on the Stack Overflow salary calculator.
In the last installment of this tutorial, I introduced the concept of linear regression through Microsoft Excel. We used the LINEST function to validate our assumptions, and also used it to predict salary for values falling outside of the limits of the original dataset.
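For readers who want to see the same idea outside a spreadsheet, here is a minimal sketch of a straight-line fit in Python. It uses NumPy's `np.polyfit` as a rough stand-in for what LINEST does with a single input column. The `years` and `salary` values and the `predict_salary` helper are hypothetical, made up for illustration, and are not the dataset used in this series.

```python
import numpy as np

# Hypothetical sample data (NOT the article's dataset):
# years of experience and the matching salary in dollars.
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = np.array([45000.0, 50000.0, 55000.0, 60000.0, 65000.0])

# Fit a degree-1 polynomial (a straight line) by least squares.
# polyfit returns the coefficients from highest degree to lowest,
# so the slope comes first and the intercept second.
slope, intercept = np.polyfit(years, salary, 1)


def predict_salary(experience):
    """Predict a salary from years of experience using the fitted line."""
    return slope * experience + intercept


# Predict for a value outside the range of the sample data.
predicted = predict_salary(8.0)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

Because the sample points here lie exactly on a line, the fit reproduces them perfectly. Real salary data will not behave that way.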
In this part, we will look at how to tune linear regression for accuracy and precision. In the process, we will explore the “learning” component of machine learning.
Now that we have a basic understanding of linear regression, let’s take a closer look at the learning part of machine learning.
If a tool like Microsoft Excel can do ML, what’s all the hype and buzz about? If ML is just about applying the right algorithm to data, where is the learning and training aspect? Let’s try to get an answer for that.
Remember that, compared to the actual salary, our prediction was off by roughly plus or minus $100. Though this may not matter much in simple scenarios, the error can grow much larger in complex datasets, making the predictions inaccurate and almost useless.
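One way to put a number on that gap is to compare each prediction with the actual value and summarize the differences. The sketch below does this with two common measures, the mean absolute error and the root mean squared error. The data points are hypothetical and chosen only so that the errors land in the range of tens of dollars; they are not the figures from the previous installment.

```python
import numpy as np

# Hypothetical data (NOT the article's dataset): salaries that
# sit close to a straight line, but not exactly on it.
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
actual = np.array([45100.0, 49900.0, 55100.0, 59900.0, 65000.0])

# Fit a straight line by least squares.
slope, intercept = np.polyfit(years, actual, 1)
predicted = slope * years + intercept

# Residuals: how far each prediction is from the actual salary.
residuals = actual - predicted

# Mean absolute error: the average size of the miss, in dollars.
mae = np.mean(np.abs(residuals))

# Root mean squared error: like MAE, but penalizes large misses more.
rmse = np.sqrt(np.mean(residuals ** 2))

for y, a, p in zip(years, actual, predicted):
    print(f"years={y:.0f} actual={a:.0f} predicted={p:.0f} error={a - p:+.0f}")
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

A single summary number like this gives us something concrete to reduce, which is where the “learning” part comes in.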
Read the entire article at The New Stack