Example: Use Scikit-Learn and PySpark ML Models in Java Using MLeap

Introduction:

Many of the most popular machine learning frameworks are Python-based, while Java has long been a preferred language for backend development. One way to bridge the two is to expose ML models as APIs. The downside is that you have to manage another service and pay for extra network calls that could otherwise be avoided.

So, the question: how do we use scikit-learn or PySpark models in Java?

Steps:

This example uses MLeap to demonstrate how to load an ML model. First, here are the steps involved in using ML models trained with scikit-learn or PySpark in Java.

  1. Use scikit-learn or PySpark to export the ML model (for example, Logistic Regression or Random Forest) using MLeap. I will write another post showing how to export an ML model. Refer to this, a nice example demonstrating the export of an ML model.
  2. Load the model using the Scala interface provided by MLeap. Since both Scala and Java run on the JVM, we can call Scala methods from Java.

Step 1: The data and the model

Following the example given here, we will download the model it generates from here.

The model is a logistic regression trained on the Airbnb data. Download the data from here. The data contains the following information about Airbnb accommodations:

['id', 'name', 'price', 'bedrooms', 'bathrooms', 'room_type', 'square_feet', 'host_is_superhost', 'state', 'cancellation_policy', 'security_deposit', 'cleaning_fee', 'extra_people', 'number_of_reviews', 'price_per_bedroom', 'review_scores_rating', 'instant_bookable']

Here, features such as bedrooms, bathrooms, and square_feet were extracted. Logistic regression was then applied to relate these features to price, and the resulting model was exported using MLeap.
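Before loading the downloaded model, it can help to check that the file is a readable zip archive. The sketch below uses only the JDK. It assumes that an MLeap bundle exported as a zip has a top-level bundle.json entry, and the file name airbnb.model.lr.zip is a placeholder for wherever you saved the download. The class and method names are made up for this illustration and are not part of MLeap.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for sanity-checking a downloaded model archive.
class BundleInspector {

    // Returns the entry names of the zip archive, sorted alphabetically.
    static List<String> listEntries(Path zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            return zip.stream()
                    .map(ZipEntry::getName)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Assumption: an MLeap zip bundle has a top-level "bundle.json" entry.
    static boolean looksLikeBundle(Path zipPath) throws IOException {
        return listEntries(zipPath).contains("bundle.json");
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; pass the real location as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "airbnb.model.lr.zip");
        for (String name : listEntries(path)) {
            System.out.println(name);
        }
        System.out.println("Has bundle.json: " + looksLikeBundle(path));
    }
}
```

If the archive cannot be opened or the check fails, the download is probably incomplete or not in the expected format, so it is worth fetching it again before debugging the loading code.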

Step 2: Loading the ML model in Scala

Scala code that loads the model and runs a sample test.

Running this code gives the following output:

Price LR: 232.62463916840318


All code is adapted from the MLeap documentation.

For an example with a scikit-learn model, see here.

 
