Akshay Utkarsh Sharma edited this page Mar 29, 2016 · 16 revisions

Spark-Transformers

A library for exporting Spark models to the Java ecosystem.

The goals of this library are to:

  • Provide a way to export Spark models/transformations into a custom format that can be imported back into a Java object.
  • Provide a way to run model predictions in the Java ecosystem.

Usage

Add the jar to the classpath when launching the Spark shell (see http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell):

./bin/spark-shell --master local --jars adapters-V1.0-SNAPSHOT-jar-with-dependencies.jar

Train and export the model in Spark:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import org.apache.spark.ml.classification.LogisticRegression

// Load training data
val training = sqlContext.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")

val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.3).setElasticNetParam(0.8)

// Fit the model
val lrModel = lr.fit(training)

// Print the coefficients and intercept for logistic regression
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

import com.flipkart.fdp.ml.export.ModelExporter

// Export the trained model
val exportedModel = ModelExporter.export(lrModel, training)

// Save exportedModel somewhere the Java application can read it
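The "save somewhere" step is left open above. As one possible approach, the sketch below writes the exported model to a file and reads it back, assuming that `ModelExporter.export` returns a `byte[]` (an assumption, not confirmed on this page). The class name `ExportedModelStore` and the placeholder bytes are hypothetical; only standard `java.nio.file` calls are used, and the same calls can be made from the Spark shell.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of spark-transformers.
public class ExportedModelStore {

    // Writes the exported model bytes to the given file, replacing its contents.
    public static void save(byte[] exportedModel, Path file) throws IOException {
        Files.write(file, exportedModel);
    }

    // Reads the bytes back so they can be handed to the importer.
    public static byte[] load(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes; in practice this would be the value returned by
        // ModelExporter.export(...), ASSUMED here to be a byte[].
        byte[] exportedModel = "placeholder-model".getBytes(StandardCharsets.UTF_8);

        Path file = Files.createTempFile("lr-model", ".bin");
        save(exportedModel, file);
        byte[] restored = load(file);

        // Confirms that the round trip preserved every byte.
        System.out.println(Arrays.equals(exportedModel, restored));
    }
}
```

On the Java side, the bytes returned by `load` would then be passed to the importer in place of `exportedModel`.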

Import and predict in Java:

// Import the exported model and get a Transformer
Transformer transformer = ModelImporter.importAndGetTransformer(exportedModel);

// Predict on a single feature vector
double predicted = (double) transformer.transform(new Double[] {0.3, 0.5, 0.6});
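To make concrete what a prediction means for the logistic regression model trained above, here is a minimal sketch that computes one by hand from coefficients and an intercept, using the standard binary logistic regression formula. It is not the library's implementation; the class name `LogisticRegressionSketch` and the sample coefficient values are hypothetical.

```java
// Hypothetical illustration, not part of spark-transformers.
public class LogisticRegressionSketch {

    // Returns P(label = 1 | features) = sigmoid(coefficients . features + intercept).
    public static double probability(double[] coefficients, double intercept, double[] features) {
        if (coefficients.length != features.length) {
            throw new IllegalArgumentException("coefficients and features must have the same length");
        }
        double margin = intercept;
        for (int i = 0; i < features.length; i++) {
            margin += coefficients[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-margin));
    }

    // Returns 1.0 when the probability exceeds the threshold, otherwise 0.0.
    public static double predict(double[] coefficients, double intercept,
                                 double[] features, double threshold) {
        return probability(coefficients, intercept, features) > threshold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        // Made-up values standing in for lrModel.coefficients and lrModel.intercept.
        double[] coefficients = {0.5, -0.25, 0.1};
        double intercept = 0.2;
        double[] features = {0.3, 0.5, 0.6};

        System.out.println(probability(coefficients, intercept, features));
        System.out.println(predict(coefficients, intercept, features, 0.5));
    }
}
```

A hand-rolled check like this can be useful for comparing the value returned by `transformer.transform` against the coefficients printed in the Spark shell.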

For detailed usage, see the unit tests, for example: https://github.com/flipkart-incubator/spark-transformers/blob/master/adapters/src/test/java/com/flipkart/fdp/ml/adapter/BucketizerBridgeTest.java

Getting help

For help regarding usage, drop an email to fdp-ml-dev@flipkart.com
