Open source projects: Automated compilation of FOSS license information

Projects that contain free and open source software ("FOSS") must meet various requirements. One of these is to compile the license information and copyright details and to distribute them together with the software. How can this be automated as far as possible?

An important requirement of almost all open source license conditions is that the text of the license and the information on the authors be distributed together with the FOSS software. Under German law, an obligation to name the author also follows from § 13 UrhG (the German Copyright Act).

In theory, the requirement is easy to implement: simply copy the license terms and the copyright notice. In practice, however, even the simplest projects often involve a very large amount of open source software. In addition, sometimes only the object code of the open source software is directly available. The source code is freely accessible, but it would first have to be located, and in the correct version. The sheer number of open source packages to be considered can ultimately make compiling the information an enormous effort.

Even once the information has been compiled, it will need to be updated, because your own project is not static and other or updated open source libraries may soon be included.

Example: Even a Raspbian Lite environment intended for embedded use on a Raspberry Pi quickly contains around 500 open source packages. Each of these packages must be checked individually, and its license information and copyright details must be compiled. This has to be repeated every time a new version of the Raspbian image is used (or at least the differences have to be determined and the information updated).
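Determining the differences between two image versions can be sketched as follows. This is a minimal Python sketch under stated assumptions: the package lists are assumed to be text with one "<package> <version>" pair per line (the default output of dpkg-query -W has this shape), and the function names are hypothetical.

```python
def parse_package_list(text):
    """Parse lines of the form '<package> <version>' into a dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Lines without a version (for example removed packages) are skipped.
        if len(parts) >= 2:
            packages[parts[0]] = parts[1]
    return packages


def diff_package_lists(old, new):
    """Return the packages that were added, removed, or changed version."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(
        name for name in set(old) & set(new) if old[name] != new[name]
    )
    return added, removed, changed
```

Only the packages reported as added or changed would then need their license information compiled again; the entries of removed packages can be dropped.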

If the software is distributed via Docker images or another containerization solution, manual compilation is no longer an option at all.

Solution via simple scripts

One solution to this problem may be to rely on the work of the package maintainers, who generally already store license and author information in the open source packages.

It should be emphasized that package maintainers can make mistakes, so the "official" information in an open source package may be incorrect. Compiling the information for even a single open source package can be difficult, because in some cases every source code file has to be checked for license and copyright information. Even if this is done properly, the authors themselves may have omitted information, and it may be necessary to check with them. After all, if an author does not name a license, then in case of doubt no license has been granted (or only implicitly, and therefore to an unclear extent).

However, assuming that the information in the open source packages is correct, a script can usually be written quickly to compile the necessary license information - a task that can take many man-days to complete manually.

For environments such as Debian or Raspbian, an extremely simple script, deliberately kept straightforward for illustrative purposes, could be structured as follows:
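A minimal sketch of such a script in Python, not a definitive implementation. It assumes the Debian convention that each package stores its license and copyright information in /usr/share/doc/<package>/copyright and that dpkg-query is available; the function names and the heading format are illustrative choices.

```python
import subprocess
from pathlib import Path

# Debian convention: each package ships its license and copyright notes here.
DOC_ROOT = Path("/usr/share/doc")


def installed_packages():
    """Return the names of all packages known to dpkg."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package}\n"],
        capture_output=True, text=True, check=True,
    )
    return sorted(set(result.stdout.split()))


def compile_notices(packages, doc_root=DOC_ROOT):
    """Concatenate each package's copyright file under a heading with its name."""
    sections = []
    for name in packages:
        copyright_file = Path(doc_root) / name / "copyright"
        # Deliberately simple: packages without a copyright file are skipped.
        if copyright_file.is_file():
            text = copyright_file.read_text(encoding="utf-8", errors="replace")
            sections.append(f"===== {name} =====\n{text.rstrip()}\n")
    return "\n".join(sections)
```

On a Debian or Raspbian system, the result of compile_notices(installed_packages()) could be written to a text file that is shipped with the software.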

This script is for illustrative purposes only and must still be adapted to the specific use case.

A script like the one above compiles, for each installed package, the license conditions and copyright information stored in that package, and does so completely automatically.

When the work is automated by such a script, which must be developed specifically for the respective application, careful manual checks should still be carried out. For example, the above script lacks even the simplest control mechanism for detecting packages for which no information could be found.
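One such control mechanism can be sketched in Python as follows. It is a sketch under assumptions, not a complete check: it assumes the Debian convention of a copyright file at /usr/share/doc/<package>/copyright, and the function name is hypothetical.

```python
from pathlib import Path


def packages_without_notice(packages, doc_root="/usr/share/doc"):
    """Return the packages whose copyright file is missing or empty."""
    missing = []
    for name in packages:
        copyright_file = Path(doc_root) / name / "copyright"
        if not copyright_file.is_file() or copyright_file.stat().st_size == 0:
            missing.append(name)
    return missing
```

A build could, for instance, be aborted whenever the returned list is not empty, so that the affected packages are reviewed manually. The check only detects missing information; it does not verify that the stored information is correct.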

In addition, it should be considered whether other software is present that is not covered by the stored package information. In an embedded system, this could apply to the Linux kernel, for example, which is often not installed as a package and may be located on a different partition or at least in a different file system. Nevertheless, the necessary information must also be compiled for the Linux kernel.

Compliance with further requirements regarding open source license conditions

A script - as illustrated above - can quickly help to save several man-days of manual work that would have to be repeated for the smallest changes to the software compilation.

It should be emphasized, however, that the usual open source license conditions result in various other requirements that must be observed for license-compliant use. These include, for example, clarification of whether and, if so, how exactly the source code for each open source component used must or can be made available.

In addition, commercial applications should generally avoid triggering the copyleft effect that some open source license conditions contain. Otherwise, the entire proprietary software may have to be released under the copyleft license, including its source code and free of charge.

Furthermore, a distinction must generally be made between open source components that are essentially only included "incidentally" and those open source components that are directly necessary for the execution of your own "main software".

For an overview of further requirements in the area of open source license conditions, see here.

Date: 30. Jun 2021