How to Build the ML.NET Repository

Have you wanted to contribute a bug fix or a new feature to the ML.NET repository? The first step is to pull down the repository from GitHub and get it built successfully so you can start making changes.

The ML.NET repository has great documentation. Part of it is how to build it locally from this doc. In this post, we'll go over the steps to do this so you can do the same and get started making changes to the ML.NET repository.

For a video version of this post, check below.

Fork the Repository

The first thing to do, if you haven't already, is to fork the ML.NET repository.

image (1).png

If you haven't forked the repository yet, you're good to go to the next step. However, for me, since I have already forked the repository a while back, I need to make sure I have the latest.

There are two ways to do sync up my fork with the main repository - running git commands or letting GitHub do it for you.

Syncing the Fork

We can run some git commands to sync up. GitHub has good documentation on how to do this for a more detailed explanation.

The first thing is to is to make sure you have an upstream remote set up to point to the main repository.

To check if you have it you can run the git remote -v command. If there is only an origin remote then you would need to add an upstream remote that points to the original repository.

image (2).png

If you don't have it set, this can be set with the following command.

git remote add upstream git@github.com:dotnet/machinelearning.git

Note that I have SSH set up so I use the SSH clone link. If you don't have this set up you can use the HTTPS link instead.

After setting the upstream remote, we need to get the latest from the

git fetch upstream

Once the upstream fetched we can merge those changes into our fork. Make sure you’re in the default branch and run this command to merge in the changes.

git merge upstream/main

Now you can start working on the latest code base.

Note here that I attempted to use GitHub to sync my fork. Unfortunately, it seems to not do as good of a job as the git commands.

Install Dependencies

Before we can start to build the code, there is a dependency we need to install. This dependency is included with a git submodule.

If you run the build before this step you will get errors, so it's best to do this before running the build.

To install the submodule dependencies, run the below command.

git submodule update --init

With the submodules installed we can now run the build through the command line.

Build on the Command Line

The build command is made very well in the ML.NET repository so there's very little you have to do to actually run it. We can run this on the command line. The script you run will depend on if you use Windows or Linux/Mac.

For Windows, you would run build.cmd and for Mac/Linux you would run build.sh.

The first time you run it will take a while. It needs to download several assets, such as NuGet packages and models for testing. After you download all of this, though, subsequent builds will go much faster.

Build in Visual Studio

With the main build now complete we can now build within Visual Studio. Although, currently, you may get an error in the Microsoft.ML.Samples.GPU project.

image (3).png

Why do we get this error in Visual Studio and not when we ran the build on the command line? It turns out that Visual Studio was set to have compile errors on warnings. There are a couple of things you can do to fix this.

First, since this is a samples project, the simplest thing is to just comment out the method. Instead of doing that, though, we can update the build properties of the project. We can either set the "treat warnings as errors" to "None".

image (4).png

Or, we can update the "Suppress warnings" to specify this specific warning. To get the warning we can go back and highlight the error with our cursor which will bring up a tooltip describing the error. It has a link to the CS0618 warning. We can put in the number in the "suppress warnings" section, 0618, and save the project.

image (5).png

Now we can fully build the solution in Visual Studio. Although, take note about this change when committing any other changes. You can either not include this change or include it and make a comment to discuss with the ML.NET team about it.


Hopefully, this post helps you get started to contribute to the ML.NET repository. If you make a contribution to the ML.NET repository, please let me know and we can celebrate!