Forecast Microsoft stock prices in Azure Machine Learning Studio’s AutoML.
I’ve said it before and I’ll say it again:
A common misconception about doing data science or machine learning in the cloud is that it is inherently complex and difficult. While it is true that intimate knowledge of cloud technology and networking is required if you're going to deploy something into production, nothing stops us from playing around with the technology and deploying proof-of-concept projects to demonstrate what's possible.¹
In this tutorial, we will play around with Microsoft's Azure Machine Learning platform and train a simple time-series forecasting model to predict the daily closing price of Microsoft stock (MSFT). We will then deploy that model in the cloud as a REST endpoint, and finally test the endpoint by making predictions for future dates.
WARNING! This is an epic post with lots of pictures. I tried as hard as I could to capture each step of the way with a screenshot to give a sense of security to newbies out there like me!
Prerequisite: basic knowledge of Azure Machine Learning Studio. Visit this post for a quick-and-dirty tutorial on how to Train and Deploy a Binary Classification Model in Azure Machine Learning.
In this post, we will:
- Create a new resource group
- Create a new machine learning workspace
- Create a new AutoML job
- Provide training dataset
- Create a new Azure compute cluster
- Provide a test dataset
- Train a time-series forecasting model
- Deploy the best model to a web service
- Test the web service
0. The Data
Before we get started with Azure, let’s check out the dataset first. Go to https://github.com/ecdedios/azure-automl-time-series-forecasting and download both train and test datasets.
These datasets came from Yahoo Finance. They contain daily historical data for Microsoft (MSFT) stock covering roughly the last two years. There are seven columns: Date, Open, High, Low, Close, Adj Close, and Volume. The training dataset consists of data from 5/26/2020 to 12/28/2021. The test dataset consists of data from 12/29/2021 to 5/23/2022. Notice that the test dataset begins where the training dataset ended. When it comes to time series, we do not split the data randomly. For more on this, visit this post by Dario Radečić.
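The chronological split described above can be sketched in a few lines of pandas. This is a minimal illustration with made-up prices and a hypothetical cutoff date, not the actual dataset; the column names match the Yahoo Finance export (Date, Close, etc.).

```python
import pandas as pd

# Tiny stand-in for the MSFT daily data; values are made up.
df = pd.DataFrame({
    "Date": pd.date_range("2020-05-26", periods=10, freq="B"),  # business days
    "Close": [181.4, 181.8, 183.3, 182.8, 184.9,
              187.2, 188.4, 189.1, 187.7, 190.0],
})

# Sort by date, then cut at a fixed point in time: everything before
# the cutoff trains the model, everything on or after it tests it.
# No shuffling -- a random split would leak the future into training.
df = df.sort_values("Date").reset_index(drop=True)
cutoff = pd.Timestamp("2020-06-05")
train = df[df["Date"] < cutoff]
test = df[df["Date"] >= cutoff]

print(len(train), len(test))                         # 8 2
print(train["Date"].max() < test["Date"].min())      # True: no overlap
```

The same idea scales directly to the real CSVs: pick the cutoff (12/29/2021 in our case) and slice.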
1. Create a new resource group
According to Microsoft, “A resource group is a container that holds related resources for an Azure solution. The resource group can include all the resources for the solution, or only those resources that you want to manage as a group.”²
Simply put, resource groups hold all the related resources that work together to accomplish a certain goal.
Let’s start our journey together by going to the Azure portal at https://portal.azure.com.
Click on “Resource groups.”
Then, click on “Create.”
Select a subscription you'd like to use and type in a name for your resource group. Also, select a region. I recommend a region that is close to you.
Click on “Review + create.”
You’ll see it validating. Once finished, click on “Create.”
2. Create a new machine learning workspace
Microsoft states that a “workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. The workspace keeps a history of all training runs, including logs, metrics, output, and a snapshot of your scripts.”³
In other words, it’s just like a resource group that is created specifically for machine learning. The machine learning workspace is your gateway to the Azure Machine Learning Studio, where you orchestrate the different services needed to do machine learning tasks.
Back to the Azure portal, click on the resource group that you just created.
Click on “Create” and a dropdown will appear.
Select “Azure Machine Learning.”
Click on the “Create” button.
Select the subscription and resource group that you’d like to use. Type in a name for your workspace and select a region. The rest of the fields will auto-populate.
Click on the “Review + create” button.
It will go to another round of validation. Once complete, click on the “Create” button.
Deployment of the workspace is not instant; you may have to wait just a little bit. When done, click on “Go to resource.”
Let’s now leave the Azure portal and go into the Azure Machine Learning Studio. Click “Launch studio.”
3. Create a new AutoML job
Microsoft hails AutoML as “the process of automating the time-consuming, iterative tasks of machine learning model development. It allows data scientists, analysts, and developers to build ML models with high scale, efficiency, and productivity all while sustaining model quality.”⁴
Long story short, it lets data scientists of all skill levels build, deploy, and operationalize machine learning models at scale.
Click on “Start now.”
Click on “New Automated ML job.”
Click on “Create” and select “From local files” in the dropdown that appears.
Name your dataset and type in a short description.
4. Provide training dataset
Click on the “Browse” button and select “Browse files” in the dropdown that appears.
Select the training set that you downloaded earlier from Github.
Confirm that the details of the file are correct and then click on the “Next” button.
It will take a few seconds for your newly uploaded dataset to appear. Click on "Refresh" to see it on the list.
Once it appears on the list, select it by checking the box to the left of its name.
5. Create a new Azure compute cluster
Now we are back to the Automated ML job creation screen/panel.
Configure the job by typing in an experiment name and selecting your target variable. The target variable is the variable that you want to predict.
Select the compute type to use. If you don’t have one already, click on “New.”
Select the region for the compute cluster, select the cheapest VM, leave the rest of the defaults, and click on the “Next” button.
Type in a name for your compute cluster, change the maximum number of nodes to 2, leave the rest of the defaults, and click on “Create.”
Now, we’re back to the Automated ML job creation screen/panel again.
Confirm that the details are correct and then click on the “Next” button.
Now it’s time to provide the test set for validation.
Click on the dropdown as shown below and select “Provide a test data set.”
Click on “Create” and select “From local files” in the dropdown that appears.
Just like before, provide the details of the dataset and click “Next.”
6. Provide a test dataset
Click on the “Browse” button and select “Browse files” to begin the upload.
Select the test set.
Confirm that the file details are correct, and then, click “Next.”
Change any properties that weren't autodetected correctly.
Make sure the dataset that you just uploaded is the selected test data set.
7. Train a time-series forecasting model
Running the Automated ML job will more than likely take a while.
The job is finished once the status says “Completed.”
In this case, we received an early-stopping warning. We're going to ignore it.
Now, click on the “Models” tab so we can see which model performed the best.
8. Deploy the best model to a web service
There will be a lot of algorithms listed. We want to focus on the one with a "View explanation" link in the "Explained" column.
Click on that algorithm’s name.
On the right of the “Deploy” menu, there’s a dropdown arrow. Click it.
Select “Deploy to web service.”
A panel will appear on the right. Type in a few details like endpoint name and description. For the compute type, select “Azure Container Instance.”
We should get a notification that the model deployment is successfully triggered.
Let’s check out the endpoints.
If you don’t see the sidebar on the left, just click on the hamburger icon (three horizontal lines) and click on “Endpoints.”
Click on the endpoint name and let’s see if it’s ready.
Be sure to wait until the deployment is completed. At times, the deployment state might show as "Failed" or "Unhealthy," but the only way to know the real result is to wait until the notification says "Endpoint xxx deployment completed," as demonstrated in the screenshots below:
When the deployment state says “Healthy,” it means we’re ready to proceed.
Scroll down on the endpoint’s details tab and look for “REST endpoint.”
Take note of the web address. This is your service URI; you will need it to consume the model via web requests. For now, though, let's just test the endpoint from the Azure Machine Learning Studio UI.
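Consuming the endpoint programmatically boils down to POSTing a JSON body to the service URI. Below is a sketch only: the URI is a placeholder, the row values are made up, and the "Inputs"/"data" wrapper is an assumption about the common AutoML request shape; check your endpoint's "Consume" tab in the Studio for the exact schema your deployment expects.

```python
import json

# Placeholder -- substitute the REST endpoint address you noted above.
SERVICE_URI = "http://<your-endpoint>.azurecontainer.io/score"

# Field names must match the training columns; values here are made up.
payload = {
    "Inputs": {
        "data": [
            {
                "Date": "2022-05-24",
                "Open": 257.9,
                "High": 261.5,
                "Low": 253.5,
                "Adj Close": 259.6,
                "Volume": 29000000,
            }
        ]
    }
}

body = json.dumps(payload)

# Uncomment to call the live endpoint (requires the deployment above
# and, if auth is enabled, a Bearer/api-key header from the Consume tab):
# import requests
# headers = {"Content-Type": "application/json"}
# response = requests.post(SERVICE_URI, data=body, headers=headers)
# print(response.json())
```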
9. Test the web service
Navigate to the “Test” tab.
Change a few fields. This will be the input data the model uses to make a prediction.
Click on the “Test” button.
We have successfully made a forecast.
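When you call the endpoint over REST instead of through the UI, the forecast comes back as JSON. A minimal sketch of unpacking it is below; the response shape shown is hypothetical (a "forecast"/"index" structure that AutoML forecasting services commonly return), so inspect an actual response from your own endpoint first.

```python
import json

# Hypothetical response body -- the real shape depends on the scoring
# script AutoML generated for your deployment.
raw = '{"forecast": [262.52], "index": [{"Date": "2022-05-24T00:00:00.000Z"}]}'

result = json.loads(raw)
predicted_close = result["forecast"][0]
print(f"Predicted close: {predicted_close}")
```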
That’s it for now.
In just a few minutes, we were able to train and deploy a time-series forecasting model entirely from the Azure Machine Learning Studio UI. Imagine what we could do with the Python SDK!
Maybe in the next post.