
Intro to LeVar 0.1

A database to make ML evaluation sane

Elias Ponvert

director of data science at People Pattern

Hello Austin Data Meetup!

What this talk is about

  • Machine learning evaluation is a pain, and in my experience we're all terrible at it
  • Here's what we should do instead
  • Say hello to LeVar, a database designed to help us do what we should do

Some more about LeVar

  • Demo!
  • A bit about how it's implemented, if you're interested
  • Open issues and next steps

Machine learning evaluation is a pain and we're all terrible at it

Quick show of hands, does anybody disagree with this statement?

Evaluation is important

  • Evaluate early, evaluate often
  • Keep a lab notebook
  • Evaluate every change
  • Do error analysis
  • Change your model based on evidence

But what happens

  • No standard source, storage or format for eval
  • End up rewriting our evaluation scripts for each project
  • Often couple our evaluation code with our core ML code
  • Error analysis is a pain
  • Hard to track results on the same evaluation over time
  • Harder still to do comparative error analysis over time

Our solution

We can do better

  • Keep evaluation datasets in a centralized location
  • Evaluations are largely immutable
  • Not tied to any one ML framework (e.g. scikit)
  • ...or big data framework (e.g. RDDs)
  • Totally agnostic with respect to method
  • Human-readable datasets
  • Use a simple, common format for data exchange (see the sample below)
  • Open schema for data points, i.e. arbitrary features
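For example, a small classification dataset could travel as a plain TSV file like the one sketched here; the column names are made up for illustration, the point is just that the format is tabular and human-readable:

    text	label
    the battery lasts all day	positive
    screen cracked within a week	negative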

We can do better

  • Command-line tool for data import & power users
  • Several standard problem schemas supported
    • Classification
    • Regression
    • Geo-prediction
    • Structure prediction
    • Machine translation
  • Several standard useful evaluation criteria supported (two are sketched after this list)
    • Accuracy
    • Precision/recall/F-score
    • ROC
    • RMSE...
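To make those criteria concrete, here is a minimal Scala sketch of two of the metrics named above; this is just the standard math, not LeVar's internal code:

    object Metrics {
      // Classification: fraction of predictions that exactly match the gold labels
      def accuracy[A](gold: Seq[A], predicted: Seq[A]): Double = {
        require(gold.size == predicted.size, "gold and predictions must align")
        gold.zip(predicted).count { case (g, p) => g == p }.toDouble / gold.size
      }

      // Regression: root mean squared error
      def rmse(gold: Seq[Double], predicted: Seq[Double]): Double = {
        require(gold.size == predicted.size, "gold and predictions must align")
        val mse = gold.zip(predicted).map { case (g, p) => (g - p) * (g - p) }.sum / gold.size
        math.sqrt(mse)
      }
    }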

We can do better

  • Web UI showing high-level experiment results suitable for bosses or clients
  • Web UI & CLI search for error analysis
  • User can comment on anything:
    • Dataset
    • Experiment
    • Item in dataset
    • Individual prediction in experiment
    • Another comment
  • User can label anything

We can do better

  • Sensible information architecture
    • Organize datasets into groups or organizations
  • Provide sensible baseline evaluations out-of-the-box (sketched after this list)
    • Most-common class
    • Mean value
    • (Weighted) random
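A minimal Scala sketch of those baselines; the object and method names here are mine, not LeVar's API:

    import scala.util.Random

    object Baselines {
      // Classification baseline: always predict the most common class in the gold data
      def mostCommonClass[A](gold: Seq[A]): A =
        gold.groupBy(identity).maxBy(_._2.size)._1

      // Regression baseline: always predict the mean gold value
      def meanValue(gold: Seq[Double]): Double = gold.sum / gold.size

      // Weighted random: sampling a gold label uniformly at random is the same
      // as sampling labels in proportion to their observed frequency
      def weightedRandom[A](gold: Seq[A], rng: Random = new Random): A =
        gold(rng.nextInt(gold.size))
    }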

We can do better

  • REST API to use with your favorite framework (sketched below)
  • Straightforward client libraries
  • Export and import to other formats (yo RDD)
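As a sketch of what hitting the REST API could look like from Scala: the slides only promise a REST API behind basic auth, so the host, endpoint path, and response shape below are all assumptions.

    import java.net.URI
    import java.net.http.{HttpClient, HttpRequest, HttpResponse}
    import java.util.Base64

    object LevarApiSketch {
      def main(args: Array[String]): Unit = {
        val credentials = Base64.getEncoder.encodeToString("user:secret".getBytes("UTF-8"))
        val request = HttpRequest.newBuilder()
          .uri(URI.create("https://levar.example.com/api/datasets")) // hypothetical endpoint
          .header("Authorization", s"Basic $credentials")            // basic auth, per the slides
          .GET()
          .build()
        val response = HttpClient.newHttpClient()
          .send(request, HttpResponse.BodyHandlers.ofString())
        println(response.body()) // e.g. a listing of datasets
      }
    }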

Introducing LeVar

It does several of those things!

But don't freak out, it's just v0.1

Here's what's done now

  • CLI (in Scala)
  • Import/export TSV
  • Organizations data model
  • API for everything, using basic auth
  • Client code for Scala
  • Classification & regression datasets
  • Many useful evaluations

LeVar data model

  • Datasets
  • Items (each item is a single datum)
  • Experiments
  • Predictions

Everything has a creation date; everything has a UUID

The data model should support most of those other features, so I can already do e.g. error analysis by dropping into psql and writing a query
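As a rough sketch, the four entities and their shared fields could be modeled like this in Scala; the UUID and creation date come from the slide above, every other field name is an assumption:

    import java.time.Instant
    import java.util.UUID

    case class Dataset(id: UUID, createdAt: Instant, name: String)
    case class Item(id: UUID, createdAt: Instant, datasetId: UUID, fields: Map[String, String])
    case class Experiment(id: UUID, createdAt: Instant, datasetId: UUID, name: String)
    case class Prediction(id: UUID, createdAt: Instant, experimentId: UUID, itemId: UUID, value: String)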

On Github

https://github.com/peoplepattern/LeVar

Demo?

Future directions

Let's just jump over to the issues page

https://github.com/peoplepattern/LeVar/issues

THANKS!
