Welcome to numerflow’s documentation!
This project provides fully automated data workflows for the numer.ai machine learning competition. The following tasks are currently implemented:
- fetch and extract the datasets
- train and predict
- automatic upload
… as well as a task (workflow) that implements the full pipeline: fetching the training data, training the model, and finally submitting the predictions.
Getting Started
Fetch the latest release from GitHub and install the dependencies via pip install -r requirements.txt. You also need to create API credentials on the numer.ai website under the "account" link. The following permissions are needed:
- Upload submissions.
- View historical submission info.
- View user info (e.g. balance, withdrawal history).
You can run the example pipeline, called Workflow, directly:
env PYTHONPATH='.' luigi --local-scheduler --module workflow Workflow --secret="YOURSECRET" --public-id="YOURPUBLICID"
This should fetch the most recent data, train a LogisticRegression model, and submit the predictions.
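The command above takes the credentials as literal arguments. As a minimal sketch, the same invocation can be assembled in Python with the credentials read from environment variables. The variable names NUMERAI_SECRET and NUMERAI_PUBLIC_ID and all helper names below are assumptions made for illustration and are not part of numerflow; only the flags already shown in the command above are used.

```python
import os
import subprocess


def build_workflow_command(secret, public_id):
    """Return the luigi invocation from the text as an argument list."""
    return [
        "luigi",
        "--local-scheduler",
        "--module", "workflow",
        "Workflow",
        f"--secret={secret}",
        f"--public-id={public_id}",
    ]


def command_from_env(environ=None):
    """Build the command from environment variables.

    NUMERAI_SECRET and NUMERAI_PUBLIC_ID are hypothetical names chosen
    for this sketch; a KeyError is raised if either one is missing.
    """
    environ = os.environ if environ is None else environ
    return build_workflow_command(
        environ["NUMERAI_SECRET"], environ["NUMERAI_PUBLIC_ID"]
    )


def run_workflow():
    """Run the pipeline with PYTHONPATH='.', as in the shell command."""
    env = dict(os.environ, PYTHONPATH=".")
    # check=True raises CalledProcessError if luigi exits with an error.
    subprocess.run(command_from_env(), env=env, check=True)
```

Calling run_workflow() from the project root should behave like the shell command, while keeping the credentials out of the command you type.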
To roll your own model and preprocessing, start with tasks.numerai_train_and_predict.TrainAndPredict.