Welcome to numerflow’s documentation!

This project provides fully automated data workflows for the numer.ai machine learning competition. The following tasks are currently implemented:

  • fetch and extract the datasets
  • train and predict
  • upload the predictions automatically

In addition, a workflow task chains these steps into a single pipeline: it fetches the training data, trains the model, and submits the predictions.

Contents:

Getting Started

Fetch the latest release from GitHub and install the dependencies with pip install -r requirements.txt. You also need to create API credentials on the numer.ai website, under the account link. The credentials need the following permissions:

  • Upload submissions.
  • View historical submission info.
  • View user info (e.g. balance, withdrawal history).

You can run the example pipeline called Workflow directly:

env PYTHONPATH='.' luigi --local-scheduler --module workflow Workflow --secret="YOURSECRET" --public-id="YOURPUBLICID"
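To avoid typing the credentials into the shell each time, the same command can be assembled from environment variables. This is only a minimal sketch: the helper names build_command and run_workflow and the variable names NUMERAI_PUBLIC_ID and NUMERAI_SECRET are hypothetical and not part of numerflow; only the luigi arguments are taken from the command above.

```python
import os
import subprocess


def build_command(public_id, secret):
    """Return the argument list for the example Workflow task."""
    return [
        "luigi",
        "--local-scheduler",
        "--module", "workflow",
        "Workflow",
        f"--secret={secret}",
        f"--public-id={public_id}",
    ]


def run_workflow():
    # Assumed variable names; numerflow itself does not define them.
    public_id = os.environ["NUMERAI_PUBLIC_ID"]
    secret = os.environ["NUMERAI_SECRET"]
    # PYTHONPATH='.' mirrors the env prefix of the shell command.
    env = dict(os.environ, PYTHONPATH=".")
    return subprocess.run(build_command(public_id, secret), env=env, check=True)
```

Calling run_workflow() from the project root would then start the pipeline, provided both variables are set.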

This should fetch the most recent data, train a LogisticRegression model, and submit the predictions. To roll your own model and preprocessing, start with tasks.numerai_train_and_predict.TrainAndPredict.
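As a starting point for a custom model, the sketch below shows a trivial stand-in that follows the fit / predict_proba convention used by scikit-learn classifiers such as LogisticRegression. The class ConstantBaseline is hypothetical, and it is an assumption that TrainAndPredict can be adapted to call a model through this interface; check the task's source before relying on it.

```python
class ConstantBaseline:
    """Hypothetical stand-in model: predicts the mean training target for every row."""

    def fit(self, X, y):
        # Remember the share of positive targets seen during training.
        self.mean_ = sum(y) / len(y)
        return self

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] pair per row, as scikit-learn classifiers return.
        return [[1.0 - self.mean_, self.mean_] for _ in X]
```

A baseline like this is mainly useful for checking that the fetch, train, and submit steps still run end to end before a real model is plugged in.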
