Nowadays almost every programming project uses dependencies, and many of them are open source. Each of these dependencies has its own license. Before you launch your project, you need to make sure you are actually allowed to use every one of those licenses.
In this article, I will show you how to set up a fully automatic license check as a step in your build pipeline for Python projects.
In an earlier project of mine, I showed that 8000 Python packages are not licensed under the GNU GPL themselves but do depend on a package that is. This might* be fine for the package itself, but if you use such a package in enterprise software, you have just created a product that uses the GNU GPL.
*This is not tested in a court as far as I know, and every court of every country, state, county, federation, etc. could rule differently. Consult a lawyer if you need legal advice, not a random article on the internet.
This article continues on "8000+ python packages might have to change to GNU Public License".
To avoid this, I will show two ways of checking your licenses: checking only your installed environment after building, and checking your full Docker image before shipping. I will show them with GitHub Actions and Jenkins.
General setup
We are going to use the Python package license-scanner. This package looks in pyproject.toml for whitelisted licenses and packages. It works by finding ALL packages installed in the Python environment and checking whether each package, or its license, is whitelisted. If one or more packages do not pass this test, an error is raised.
Below is an example; it contains a common list of licenses that can be used in commercial software.
To test this locally, navigate to the folder containing pyproject.toml and run license_scanner -m whitelist.
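To make the idea behind this check concrete, here is a minimal sketch in plain Python using only the standard library. It is not license-scanner's actual implementation, and the names ALLOWED_LICENSES, ALLOWED_PACKAGES, declared_licenses, find_violations and check_environment are hypothetical; the allowlists are hard-coded here only for illustration, whereas the real tool reads its configuration from pyproject.toml.

```python
from importlib.metadata import distributions

# Hypothetical allowlists (lowercase) used only for this illustration.
ALLOWED_LICENSES = {"mit", "mit license", "bsd license", "apache software license"}
ALLOWED_PACKAGES = {"pip", "setuptools"}


def declared_licenses(meta):
    """Collect lowercase license strings from a package's metadata."""
    found = set()
    # Free-text License field and the newer License-Expression field.
    for field in ("License", "License-Expression"):
        value = meta.get(field)
        if value:
            found.add(value.strip().lower())
    # Trove classifiers such as "License :: OSI Approved :: MIT License".
    for classifier in meta.get_all("Classifier") or []:
        if classifier.startswith("License ::"):
            found.add(classifier.split("::")[-1].strip().lower())
    return found


def find_violations(allowed_licenses, allowed_packages):
    """Return {package: licenses} for installed packages that fail the check."""
    violations = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name is None or name.lower() in allowed_packages:
            continue
        licenses = declared_licenses(dist.metadata)
        # A package passes if at least one declared license is allowed.
        if not licenses & allowed_licenses:
            violations[name] = sorted(licenses) or ["unknown"]
    return violations


def check_environment():
    """Raise an error if any installed package fails the allowlist check."""
    violations = find_violations(ALLOWED_LICENSES, ALLOWED_PACKAGES)
    if violations:
        raise RuntimeError(f"Packages with non-allowed licenses: {violations}")
```

Calling check_environment() inside the environment you want to verify raises as soon as one package is neither allowlisted itself nor covered by an allowed license, which is the behaviour a build pipeline can use to fail the build.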
GitHub Actions: check the dependencies of my package
When you build a package, you want this check to run automatically, but you only want to check the package's own dependencies, not your build or test tools. To keep those tools out of the scan, first install only the dependencies, using either pip install -r requirements.txt or pip install . (note the trailing dot). Then run the license scanner, and only after that install other tools such as pytest or build if needed.
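The ordering is the important part of this step. As a rough sketch, the same sequence can be written as a small Python driver script that a CI job could call; the names pipeline_steps and run_pipeline are hypothetical, and the sketch assumes that license_scanner exits with a non-zero status when the check fails.

```python
import subprocess
import sys

# Order matters: runtime dependencies first, then the scan, then dev tools.
RUNTIME_INSTALL = [sys.executable, "-m", "pip", "install", "."]
LICENSE_SCAN = ["license_scanner", "-m", "whitelist"]
DEV_INSTALL = [sys.executable, "-m", "pip", "install", "pytest", "build"]


def pipeline_steps():
    """Return the commands in the order the CI job should run them."""
    return [RUNTIME_INSTALL, LICENSE_SCAN, DEV_INSTALL]


def run_pipeline(runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for command in pipeline_steps():
        runner(command, check=True)
```

Because pytest and build are installed only after the scan, their licenses never show up in the scanned environment.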
Jenkins: check a full docker image
If you are deploying a Docker image as a product, it is better to check the entire image. To avoid cluttering the production image with test code, we create a separate test image that is based on the production image.
Create a file named Dockerfilelicensescan and populate it as below. Note how the image we build from is variable.
In the build pipeline (Jenkins in this example), we add a build step and a test step. In the test step, we build the Docker test image on top of the Docker production image and then simply run the container. If an exception is raised, the build pipeline fails.
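The test step boils down to two docker commands, sketched here as a small Python helper. This is only an illustration under assumptions: the helper names, the image tag license-scan and the build argument BASE_IMAGE are hypothetical, and the sketch assumes that Dockerfilelicensescan takes its base image from that build argument and runs the license scanner as its default command.

```python
import subprocess


def scan_image_commands(production_image, scan_tag="license-scan",
                        dockerfile="Dockerfilelicensescan",
                        build_arg="BASE_IMAGE"):
    """Return the docker build and run commands for the license-scan image."""
    build = [
        "docker", "build",
        "-f", dockerfile,
        # Pass the production image in as the (hypothetical) base image argument.
        "--build-arg", f"{build_arg}={production_image}",
        "-t", scan_tag,
        ".",
    ]
    # Running the container executes the scan; --rm cleans up afterwards.
    run = ["docker", "run", "--rm", scan_tag]
    return [build, run]


def scan_image(production_image, runner=subprocess.run):
    """Build and run the scan image; check=True fails on a non-zero exit."""
    for command in scan_image_commands(production_image):
        runner(command, check=True)
```

In a Jenkins test stage, the same two commands could also be run directly as shell steps; the point is that a failing scan inside the container produces a non-zero exit code, which fails the stage.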
Conclusion
It is relatively simple to set up an automatic way to check your Python licenses. The hard part is to find out what licenses you can use within your project or to refactor your code if a dependency uses a license you want to avoid.