Project author: srowen

Project description:
Modeling Lifecycle with ACME Occupancy Detection and Cloudera
Language: Scala
Project URL: git://github.com/srowen/cdsw-simple-serving.git
Created: 2017-03-15T16:41:48Z
Project community: https://github.com/srowen/cdsw-simple-serving

License: Apache License 2.0



Modeling Lifecycle with ACME Occupancy Detection and Cloudera

Data science is more than just modeling. The complete data science lifecycle also includes data
engineering and model deployment. This project offers a simplified yet credible example of
all three elements, as implemented using Apache Spark, the
Cloudera Data Science Workbench,
and JPMML / OpenScoring.

In this project, the ACME corporation is productionizing a connected-house platform. Part of this
service requires predicting the occupancy of a room given sensor readings.

This example project includes simplified examples of:

  • Data Engineering
    • Ingest
    • Cleaning
  • Data Science
    • Modeling
    • Tuning and evaluation
  • Model Serving
    • Model management
    • Testing
    • REST API

Requirements

Get Started

To continue, review the documentation for each of the three modules, which contains more
information about what the module shows and how to run it.
