Submissions

There are two types of submissions. The first is the nightly submission of predictions, which each team may make if it chooses to. The second is the code submission from the top 5 teams, which is reviewed for readability and reproducibility as we finalize the top 3 winning teams. In both cases, submissions must be made via the environment dedicated to the team, so setting up your environment is important.

Accessing the Compute Environment

Once all the team members have successfully signed the NDA, each team will have a dedicated AWS machine on which to execute their code. To use the AWS environment, each team member will be provided with separate credentials for connecting to the environment assigned to their team.

DO NOT SHARE YOUR CREDENTIALS (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) WITH ANYONE.

Each team will share one common Jupyter workspace in which to build their models. Once everyone on your team has signed the NDA and registered, you’ll receive an email with instructions and the credentials. Please save the following commands in a text file or sticky note to keep them handy, replacing XXXXXXX with the actual values from the email.

Once you have all the information, use the links provided in the email or below to install python3, pip3, aws-cli and the AWS SSM client on your local machine.

On Mac or Linux terminals:

export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXX
aws ssm start-session --target i-XXXXXXXXX --document-name AWS-StartPortForwardingSession --parameters '{"portNumber":["9443"], "localPortNumber":["9443"]}'

On Windows:

set AWS_ACCESS_KEY_ID=XXXXXX
set AWS_SECRET_ACCESS_KEY=XXXXXXXX
aws ssm start-session --target i-XXXXXXXXX --document-name AWS-StartPortForwardingSession --parameters '{\"portNumber\":[\"9443\"], \"localPortNumber\":[\"9443\"]}'

After the above command runs successfully, you will see the following output: “Waiting for connections…”
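The Mac/Linux and Windows commands above differ only in how the JSON passed to --parameters is quoted. As a rough sketch, assuming you prefer to launch the session from Python, the same arguments can be built as a list so that no shell quoting is involved, and the local port can be probed once the session reports that it is waiting for connections. The helper names build_port_forward_command and is_port_open are hypothetical and not part of the TracHack instructions; the instance ID placeholder must still be replaced with the value from your email.

```python
import json
import socket


def build_port_forward_command(instance_id, port=9443):
    """Return the `aws ssm start-session` command as an argument list."""
    # json.dumps produces the same JSON that the shell commands pass to
    # --parameters, so no shell-specific quoting or escaping is needed.
    parameters = json.dumps(
        {"portNumber": [str(port)], "localPortNumber": [str(port)]}
    )
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", parameters,
    ]


def is_port_open(host="localhost", port=9443, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

With AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set in the environment, passing the list to subprocess.run(build_port_forward_command("i-XXXXXXXXX")) starts the same session as the commands above. From a second terminal, is_port_open() should return True while the session is running.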

Next, in your browser go to: https://localhost:9443

The first time you open the page, you may see the warning “Your connection is not private”. Most modern browsers show this by default, and it is nothing to worry about. If you are using Chrome, you can type thisisunsafe on the keyboard, and it will let you proceed without prompting you again. Other browsers offer a similar way to proceed.

You’ll then see JupyterHub’s login page. Since the environment is shared among the team members, the username and password are:

  • username : jovyan
  • password : jupyter

Now you should be able to create a new Notebook (PySpark or Python3) depending on your competition. If you wish to install custom Python packages, you can open the terminal in Jupyter and use pip3 install.

Daily Prediction Submission

Each team is invited to make a daily submission of their predictions using the evaluation dataset. Every night an automated job will take the latest submission (if present), run the scoring, and populate the leaderboard on the TracHack website the next morning.

A team can make a submission for the day by placing their predictions in a CSV file named with the submission date in YYYY-MM-DD format inside the submission folder within their environment. For example, the file submission/2021-04-10.csv will contain the team’s submission for 10 April 2021. While teams are not required to submit every day, regular submissions are very useful for getting feedback on how the team is progressing.
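As an illustration of the naming rule, here is a minimal sketch that builds the file path for a given day. The function name submission_path is hypothetical, and the sketch assumes the notebook's working directory is the workspace root that contains the submission folder. When no date is passed it uses the machine's local date; the time zone used by the nightly job is not stated here, so check it if you submit close to midnight.

```python
from datetime import date
from pathlib import Path


def submission_path(day=None, folder="submission"):
    """Return the daily submission path, e.g. submission/2021-04-10.csv."""
    day = day or date.today()
    # date.isoformat() yields YYYY-MM-DD, the format the file name requires.
    return Path(folder) / f"{day.isoformat()}.csv"
```

For example, submission_path(date(2021, 4, 10)) gives the path from the example above.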

TracHack 21.1: For TracHack 21.1 the CSV must consist of comma-separated values with the header line_id and one_time_redeemer. It should contain exactly one entry for each line_id in the evaluation dataset, with the respective predicted ‘one_time_redeemer’ flag set to either 1 (yes) or 0 (no).

line_id,one_time_redeemer
a908d0df-a352-4e54-b4f8-17ed2678beb6,1
9a9c868f-15c0-4eb7-8c5b-409c8fed2bc3,0
2a8e1fb0-0647-46ae-b229-0b8c3ac94651,0
136ebaf4-b521-4ece-b4fd-51f4c4a7e9e8,1
... and so on

TracHack 21.2: For TracHack 21.2 the CSV must consist of comma-separated values with the header line_id and upgrade. It should contain exactly one entry for each line_id in the evaluation dataset, with the respective predicted ‘upgrade’ flag set to either 1 (yes) or 0 (no).

line_id,upgrade
a908d0df-a352-4e54-b4f8-17ed2678beb6,1
9a9c868f-15c0-4eb7-8c5b-409c8fed2bc3,0
2a8e1fb0-0647-46ae-b229-0b8c3ac94651,0
136ebaf4-b521-4ece-b4fd-51f4c4a7e9e8,1
... and so on
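The two formats above differ only in the name of the second column, so one small writer can cover both. The following is a sketch using Python's standard csv module, assuming your predictions are held in a dictionary mapping each line_id to a 0/1 flag. The function name write_submission and its parameters are illustrative, not part of the competition tooling.

```python
import csv


def write_submission(path, predictions, target_column="one_time_redeemer"):
    """Write {line_id: flag} predictions as a two-column submission CSV."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["line_id", target_column])
        for line_id, flag in predictions.items():
            # int() turns True/False into 1/0 and raises ValueError on
            # values that cannot be interpreted as an integer.
            writer.writerow([line_id, int(flag)])
```

TracHack 21.2 teams would pass target_column="upgrade". Because a dictionary holds each key once, this layout also guarantees a single row per line_id.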

NOTE: You must make a prediction for ALL line_ids in the eval dataset for the submission.
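Before placing a file in the submission folder, it may help to check it against the rules above. The sketch below is an illustrative self-check, not the organizers' scoring code: check_submission is a hypothetical helper that assumes you can supply the list of line_ids from the evaluation dataset, and it reports header problems, duplicate or missing line_ids, and flags other than 0 or 1.

```python
import csv


def check_submission(path, eval_line_ids, target_column):
    """Return a list of problems found in a submission CSV (empty if none)."""
    problems = []
    seen = set()
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != ["line_id", target_column]:
            problems.append(f"unexpected header: {header}")
        for row in reader:
            if not row:
                continue  # ignore blank lines
            if len(row) != 2:
                problems.append(f"malformed row: {row}")
                continue
            line_id, flag = row
            if line_id in seen:
                problems.append(f"duplicate line_id: {line_id}")
            seen.add(line_id)
            if flag not in ("0", "1"):
                problems.append(f"invalid flag for {line_id}: {flag}")
    expected = set(eval_line_ids)
    missing = expected - seen
    extra = seen - expected
    if missing:
        problems.append(f"{len(missing)} line_ids missing from submission")
    if extra:
        problems.append(f"{len(extra)} line_ids not in evaluation dataset")
    return problems
```

An empty list means the file passed these checks; it does not say anything about how well the predictions will score.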

Final Code Submission

When it is time to get ready for the final submission for TracHack, there are three things you should do:

1. Submit your predictions as submission CSV files named with the date:
a. TracHack 21.1 teams will produce an S3 submission for the day: yyyy-mm-dd/*.csv
b. TracHack 21.2 teams will produce an S3 file for the day: yyyy-mm-dd.csv

2. Create a folder called ‘code’ in your Jupyter workspace, similar to the daily submission folder.

3. Consolidate all of your code (data prep, feature selection, model training and prediction, etc.) into a single Jupyter notebook and call it mlcode.ipynb. Save this inside the code folder you created in step 2. It is VERY important that your code reproduces the submission. We will use that notebook to reproduce your submission predictions. If these don’t match, then your submission is not valid.
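To check reproducibility yourself before the final submission, one option is to rerun mlcode.ipynb from a fresh kernel, write its predictions to a separate file, and compare that file with the CSV you submitted. The sketch below assumes both files use the two-column format described earlier. The helper names are hypothetical, and the exact matching criterion the reviewers apply is not specified here, so treat this as a conservative check that flags any line_id whose prediction differs.

```python
import csv


def load_predictions(path):
    """Read a submission CSV into {line_id: flag}, skipping the header."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # header row
        return {row[0]: row[1] for row in reader if len(row) >= 2}


def differing_line_ids(submitted_path, reproduced_path):
    """Return sorted line_ids whose prediction differs between the files.

    A line_id present in only one of the two files also counts as differing.
    """
    submitted = load_predictions(submitted_path)
    reproduced = load_predictions(reproduced_path)
    all_ids = submitted.keys() | reproduced.keys()
    return sorted(i for i in all_ids if submitted.get(i) != reproduced.get(i))
```

An empty result means the rerun matched the submitted file row for row. If it does not, common causes are unfixed random seeds or steps that were run by hand outside the notebook.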