
Predict credit card approval using a Jupyter notebook, scikit-learn, and Postman.

A common misconception about doing data science or machine learning in the cloud is that it is complex and difficult. While it’s true that deploying something into production requires intimate knowledge of cloud technology and networking, nothing says we can’t play around with the technology and deploy proof-of-concept projects to demonstrate what’s possible.

In this tutorial, we will play around with Microsoft’s Azure Machine Learning platform and train a simple binary classification model that predicts whether an applicant will be approved for a credit card or not. We will then take that model and deploy it in the cloud as a REST endpoint. We will do all this using a Jupyter notebook hosted in the cloud, and we will also use Postman to simulate the prediction process.

WARNING! This is an epic post with lots of pictures. I tried as hard as I could to capture each step of the way with a screenshot to give a sense of security to newbies out there like me!

In this post, we will:

  1. Sign up for Azure Machine Learning Studio
  2. Create a machine learning workspace
  3. Request a quota increase
  4. Create a compute instance
  5. Download a dataset from Kaggle
  6. Create a dataset in Azure Machine Learning Studio
  7. Play around in Jupyter notebook and predict the future!

1. Sign up for Azure Machine Learning Studio

To start, let’s go to https://azure.microsoft.com/en-us/free/ and click on the big green button that says “Start free” as shown below:

Screenshot by the Ednalyn C. De Dios

It will ask you to sign in to your Microsoft account (sign up for one if you don’t have one already).

Screenshot by the Ednalyn C. De Dios

Then it will prompt you for details to put on your Microsoft Azure profile.

Screenshot by the Ednalyn C. De Dios

Next, it will verify your identity by texting a verification code to your phone.

Screenshot by the Ednalyn C. De Dios

Then comes the dreaded part: entering your credit card information. Go ahead; fortune favors the bold… but don’t blame me if anything goes awry.

Screenshot by the Ednalyn C. De Dios

Finally, we’re ready to start.

Screenshot by the Ednalyn C. De Dios

2. Create a machine learning workspace

Go to https://portal.azure.com and on the homepage, click on “Create a resource”

Screenshot by the Ednalyn C. De Dios

Select “AI + Machine Learning” under the Categories panel,

Screenshot by the Ednalyn C. De Dios

or simply type in “machine learning” on the search box that says “Search services and marketplace” as shown below:

Screenshot by the Ednalyn C. De Dios

Then, on the machine learning card/tile that appears, click on “Create.”

Screenshot by the Ednalyn C. De Dios

Select the subscription that you would like to use. The default is usually something like “Azure subscription 1.” In this case, I renamed my subscription to ECDEDIOS-DEV.

For the resource group, click on “Create new” and type in a name for your project.

Screenshot by the Ednalyn C. De Dios

Next, fill in the workspace details…

Screenshot by the Ednalyn C. De Dios

Below, you can see the details of the machine learning workspace.

Next, scroll down.

Screenshot by the Ednalyn C. De Dios

Click on the “Launch studio” button to go to the Azure Machine Learning Studio.

Screenshot by the Ednalyn C. De Dios

3. Request a quota increase

In Azure Machine Learning Studio, if not already visible, click on the hamburger icon (three horizontal lines) on the top left of the page to expand the left blade or sidebar.

Screenshot by the Ednalyn C. De Dios

Click on “Compute”

Screenshot by the Ednalyn C. De Dios

Click on “New” to attempt creating a new compute instance.

Screenshot by the Ednalyn C. De Dios

On the right blade or sidebar that appears, fill in the details of your compute.

Screenshot by the Ednalyn C. De Dios

If no virtual machine sizes seem to be available when you’re selecting one, you might need to request a quota increase first.

Click on the text that says “Click here to view and request quota.”

Screenshot by the Ednalyn C. De Dios

Click on “Request quota.”

Screenshot by the Ednalyn C. De Dios

Fill in the details of your support request.

Screenshot by the Ednalyn C. De Dios

Don’t forget to click on “Enter details” to fill in the rest of the support ticket information.

Screenshot by the Ednalyn C. De Dios

When done, you’ll see a pop-up window on the top right of the page that says “New Support Request.” It will display the support request number for your ticket.

Screenshot by the Ednalyn C. De Dios

You should get an email that looks similar to the one below.

Screenshot by the Ednalyn C. De Dios

If approved, you will get a notification that your quotas have increased. This should take only a few hours but it really depends on the quota team’s workload.

Screenshot by the Ednalyn C. De Dios

4. Create a compute instance

Once approved, you can continue creating the compute instance.

Screenshot by the Ednalyn C. De Dios

In this case, I chose the Standard_DS11_v2 because it’s the cheapest one at $0.18 per hour.

Screenshot by the Ednalyn C. De Dios

It will take a few moments for the compute instance to be provisioned.

Screenshot by the Ednalyn C. De Dios

5. Download a dataset from Kaggle

While we’re waiting for the compute instance to be provisioned, let’s head on over to Kaggle for a clean credit card dataset.

Click on the “Download” button to download the dataset.

Screenshot by the Ednalyn C. De Dios

6. Create a dataset in Azure Machine Learning Studio

Next, let’s backtrack a little bit by clicking on the name of your workspace (creditcardproject).

Screenshot by the Ednalyn C. De Dios

Let’s expand the left blade/sidebar by clicking on the hamburger icon again.

Screenshot by the Ednalyn C. De Dios

And, click on “Datasets” on the left blade/sidebar.

Screenshot by the Ednalyn C. De Dios

On the next screen, click on the “Create dataset” button.

Screenshot by the Ednalyn C. De Dios

Select “From local files.”

Screenshot by the Ednalyn C. De Dios

Fill in the details and click on “Next.”

Screenshot by the Ednalyn C. De Dios

On the drop-down that appears, select “Browse files” and navigate to the place where you downloaded the clean dataset from Kaggle.

Screenshot by the Ednalyn C. De Dios

Click “Next” after uploading the dataset.

Screenshot by the Ednalyn C. De Dios

Double-check that the contents look correct. Click “Next” when you’re done.

Screenshot by the Ednalyn C. De Dios

On the next screen, you’ll get a chance to specify the data type of some, all, or none of the columns.

Screenshot by the Ednalyn C. De Dios

Pay special attention to the columns and their corresponding data types. Below, I changed integer to string for the ZipCode column. Click “Next” when you’re done examining and correcting the data types.

Screenshot by the Ednalyn C. De Dios

Examine the details of the dataset on the next screen and click “Create” to register the clean credit card data as an Azure Machine Learning Studio dataset that can be used for runs, experiments, and so on.

Screenshot by the Ednalyn C. De Dios

After creating the dataset, click on its name so we can explore its properties and get the details needed to use it for our experiments.

Screenshot by the Ednalyn C. De Dios

Notice that Azure Machine Learning Studio keeps track of the version number of the dataset.

It can also profile the dataset to give you a feel for its contents.

Click on “Consume” when you’re done.

Screenshot by the Ednalyn C. De Dios

Here on the “Consume” tab, we can get code snippets that we can use to refer to the dataset.

Screenshot by the Ednalyn C. De Dios

On the “Explore” tab, you can see the actual contents of the dataset.

Screenshot by the Ednalyn C. De Dios

7. Play around in Jupyter notebook and predict the future!

Now it’s time for the real fun to begin!

In the compute screen, find the compute instance that you want to use and click on “Jupyter” under Applications.

Screenshot by the Ednalyn C. De Dios

Acknowledge the warning that pops up about trusted code and click “Continue” when done.

Screenshot by the Ednalyn C. De Dios

Navigate to the Users folder and then into the folder named after your username.

Screenshot by the Ednalyn C. De Dios

On the top right of the Jupyter notebook interface, click on the “New” button and select “Python 3.8 - AzureML.”

Screenshot by the Ednalyn C. De Dios

Let’s give the notebook a proper name.

Screenshot by the Ednalyn C. De Dios

Let’s code!

Copy and paste the following into a cell in Jupyter.

https://medium.com/media/91c52948e07ba2f98c54522f9a827938

Above, we’re simply importing the needed packages.
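The embedded cell isn’t visible here, but a minimal set of imports for the scikit-learn side of this walkthrough (the azureml pieces would be imported later, in the cells where they’re used) might look like:

```python
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
```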

https://medium.com/media/a2b054a41ead55806a3afa1f4895c673

Next, we’re simply configuring Jupyter’s output to not truncate anything and display everything.
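One common way to do this is through pandas’ display options (a sketch; the original cell may differ):

```python
import pandas as pd

# Don't truncate rows, columns, or cell contents when displaying DataFrames.
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", None)
```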

https://medium.com/media/d81e01d9851d3ccbfaea6d4c79ec56f2

Above, we’re supplying the details of our subscription, resource group, workspace, and dataset that we’re going to use.

https://medium.com/media/4f59b549c8df5ff1c9482d302e393259

Here, we’re dropping attributes that may be considered discriminatory. We then assign the resulting DataFrame to a new one.
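A sketch of that step on a toy DataFrame (the column names here are hypothetical stand-ins for the dataset’s sensitive attributes):

```python
import pandas as pd

# Toy stand-in for the credit card data; real column names will differ.
df = pd.DataFrame({
    "Gender":    ["a", "b", "a"],
    "Ethnicity": ["x", "y", "z"],
    "Income":    [100, 200, 300],
    "Approved":  [1, 0, 1],
})

# Drop attributes that may be considered discriminatory and keep the
# result in a new DataFrame.
clean_df = df.drop(columns=["Gender", "Ethnicity"])
```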

https://medium.com/media/12182a235482ebd82b3c6d3e4dc75ef1

Next, we’ll create our X and y variables with X as our feature (independent) variables and y as our target (dependent) variable.
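Assuming the target column is named "Approved" (an assumption; check your own dataset), that split looks like:

```python
import pandas as pd

# Toy stand-in for the cleaned DataFrame.
clean_df = pd.DataFrame({
    "Income":      [100, 200, 300, 400],
    "CreditScore": [1, 2, 3, 4],
    "Approved":    [0, 1, 0, 1],
})

X = clean_df.drop(columns=["Approved"])  # feature (independent) variables
y = clean_df["Approved"]                 # target (dependent) variable
```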

https://medium.com/media/5ac58595affc76ed19901afbe9f590cd

Then, we’ll normalize our data with sklearn’s MinMaxScaler.
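MinMaxScaler rescales every feature to the [0, 1] range; a minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]])

# fit_transform learns each column's min and max, then rescales it to [0, 1].
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
```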

https://medium.com/media/2c1c64d0433d1adca517ee30f66c54a1

Then, we’ll connect to our workspace, create an experiment called “credit-card-project” and set up MLflow.
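A sketch of that cell using the azureml-core SDK; it needs a real Azure subscription, so the identifiers below are placeholders and the code won’t run as-is:

```python
from azureml.core import Workspace, Experiment
import mlflow

# Connect to the workspace created earlier (placeholder identifiers).
ws = Workspace.get(
    name="creditcardproject",
    subscription_id="<your-subscription-id>",
    resource_group="<your-resource-group>",
)

experiment = Experiment(workspace=ws, name="credit-card-project")

# Point MLflow tracking at the Azure ML workspace.
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow.set_experiment(experiment.name)
```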

https://medium.com/media/b3a18a3d83701a6c734e74935709d002

Here, we are splitting our dataset to create training and testing sets.
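A sketch with scikit-learn’s train_test_split (the 80/20 split and the random_state value are assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy arrays standing in for the scaled features and the target.
X_scaled = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold out 20% of the rows for testing; random_state makes the split
# reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)
```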

https://medium.com/media/76b495898464a203a653a5f7cab0820a

Now, we’re just setting up LogisticRegression as our classifier and training it.
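The training step itself is short; a self-contained sketch on toy data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy, cleanly separable data standing in for the scaled training set.
X_train = np.array([[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X_train, y_train)

preds = clf.predict([[0.05], [0.95]])  # low value denied, high value approved
```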

We should get something that looks like this:

Screenshot by the Ednalyn C. De Dios

Let’s go back to the Azure Machine Learning Studio UI and navigate to “Models.” The model list should be blank for now because we haven’t registered any models yet.

Let’s change that.

Screenshot by the Ednalyn C. De Dios
https://medium.com/media/6d3e8fb20a3c5eb6f3dfbfd4ac33db03

In the cell above, we’re simply registering the model and naming it “credit_card_model.” If you refresh the model list, you should now see the model.
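A sketch of the registration call with azureml-core; it assumes the `ws` workspace object from the earlier connection cell and a locally saved model file, so it won’t run outside Azure ML:

```python
from azureml.core.model import Model

# Register the saved model file with the workspace so it appears under
# "Models" in the studio; the path is illustrative.
model = Model.register(
    workspace=ws,
    model_path="credit_card_model.pkl",
    model_name="credit_card_model",
)
```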

Screenshot by the Ednalyn C. De Dios

Let’s set up the environment.

https://medium.com/media/5ea097f19a4680bc0caeb0f996c682f6

Above, we’re creating an environment for our deployment. In this case, we’re using an Ubuntu image that is already loaded with Python 3.8 and sklearn.
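A sketch of that environment setup with azureml-core; the curated environment name and the entry-script filename are assumptions, so check the names available in your own workspace:

```python
from azureml.core import Environment
from azureml.core.model import InferenceConfig

# A curated environment shipping Ubuntu, Python 3.8, and scikit-learn
# (exact name varies by SDK version and region).
env = Environment.get(
    workspace=ws, name="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu"
)

# score.py is the entry script that loads the model and serves requests.
inference_config = InferenceConfig(entry_script="score.py", environment=env)
```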

Next, we’re just configuring some of the deployment details like the number of CPU cores, memory, and description.

https://medium.com/media/2308b892439e1b02bb58e7278e33b2a7

The cell above is where we actually deploy the model so that we can use it for inference.
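A sketch of the configuration-plus-deployment step to Azure Container Instances, assuming the `ws`, `model`, and `inference_config` objects from the earlier cells:

```python
from azureml.core.model import Model
from azureml.core.webservice import AciWebservice

# Modest resources are plenty for a logistic regression model.
deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1, description="Credit card approval model"
)

service = Model.deploy(
    workspace=ws,
    name="credit-card-service",
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config,
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)  # the REST endpoint
```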

https://medium.com/media/6e5938284603834fcf32414730ec14c6

In the above cell, we’re simply preparing the sample data and headers that will go into the request.
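A sketch of that preparation; the feature values and the `data` key are assumptions about what the scoring script expects:

```python
import json

# One row of scaled feature values (illustrative numbers only).
input_payload = json.dumps({
    "data": [[0.42, 0.0, 1.0, 0.25, 0.8]],
})

headers = {"Content-Type": "application/json"}
```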

https://medium.com/media/3275bf9cccdf237b148325c3e708fe74

The magic actually happens in the cell above. Here, we are making a POST request to the URI of the model that we deployed earlier.
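A sketch of that request using only the standard library (the original cell may well use the `requests` package instead; the URI comes from the deployed service, so nothing is sent here):

```python
import json
import urllib.request

def score(scoring_uri: str, payload: str) -> dict:
    """POST a JSON payload to the deployed model's REST endpoint and
    return the parsed prediction response."""
    req = urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (needs the live endpoint from the deployment step):
# result = score(service.scoring_uri, input_payload)
```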

Let’s see the result:

https://medium.com/media/e332aa18cca9c159aa2812636e35ff92

Above, we’re printing the REST endpoint, the actual label, and the prediction. Here’s what the output should look like:

Screenshot by Ednalyn C. De Dios

You can check out the whole notebook here.

Bonus Round

In this section, we’re going to demonstrate using Postman to simulate the prediction process using POST.

Screenshot by Ednalyn C. De Dios

First, let’s make sure that we are making a POST request and not a GET request. Second, let’s put the URI of the endpoint (service.uri). Third, click on the “Body” tab, select “raw,” and select “JSON.” Fourth, put the body of the request in the textbox as shown above.

And finally, click on the “Send” button.

Screenshot by Ednalyn C. De Dios

Approved!