Emacsen's Blog

Is the AGPL Broken?

Introduction

Just over a year ago, Chris Webber gave a talk at CopyleftConf about how the AGPL is incompatible with a style of computing.

If you want to read the slides, they're at: https://dustycloud.org/misc/boundaries-on-network-copyleft.pdf

Sadly there hasn't been much discussion about it since, so I'm going to throw my hat into this rodeo, or some metaphor to that effect.

Before we wrestle with bulls, let's talk about the goal of the AGPL and why it's important in the Free Software ecosystem.

As most people reading this probably already know, the GNU GPL is a license that says that if you have a program, you're entitled to use it, copy it and modify it, and that if you distribute it to others, you must do so under the same terms under which you received it. It's "Share and Share Alike".

But what does this mean when we have applications that run remotely, such as web applications, where using the program means executing code on someone else's computer? The AGPL extends the same obligation to network use: if you run a modified version of an AGPL-licensed program and let others interact with it over a network, you must offer them its source, just as if you had handed them a binary.

This is a good thing in my opinion. Running a program in a networked way to get around the GPL is an anti-social thing to do.

With that out of the way, let's dive in.

A simple program

Let's first begin with the idea of a program where state is captured inside execution, rather than in variables. If you know what a closure is, then you can skim or skip this part.

If you don't know what a closure is, you might be wondering what the heck I'm talking about, but it's really not that hard to imagine. Let's take an example from Chris's own work.

Chris wrote their code in Scheme. I think the use of a Lisp can lead people to come to the conclusion that this is somehow a Lisp related issue, so I'm going to write my code in Python in order to show that the issue is universal.

Chris proposes that some programs may contain private data but at the same time be stateless. This was hard for me to wrap my head around at first, but we can write a program like this fairly easily:

    def make_greeter(greeter_name):
        return lambda guest_name: print(f"Hi {guest_name}, I'm {greeter_name}!")

With this, we can construct a greeter named Alice:

    alice = make_greeter("Alice")
    alice("Bob")

And we'd get back "Hi Bob, I'm Alice!". What's important here is that the alice function doesn't maintain state. The "Aliceness" is constructed at the time the function is defined.

The data in this case is actually the "Bob" string and not the "Alice" string. The "Alice" string is part of the alice function's executable code.
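To make that concrete, here is a small sketch of where the "Alice" string ends up. It assumes CPython, where a captured value is stored in a closure cell attached to the function object, and it returns the greeting rather than printing it so the result is easier to inspect. That change is mine, not part of the example above.

```python
def make_greeter(greeter_name):
    # Returns the greeting instead of printing it (an assumption for this sketch).
    return lambda guest_name: f"Hi {guest_name}, I'm {greeter_name}!"

alice = make_greeter("Alice")

# The captured name travels with the function object itself:
# there is no separate variable or data store holding it.
free_names = alice.__code__.co_freevars
captured = alice.__closure__[0].cell_contents

print(free_names)
print(captured)
print(alice("Bob"))
```

The only way to get at "Alice" is through the function object, which is the sense in which it belongs to the code rather than to some outside data.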

It's a nifty trick, but it has some deeper implications.

Turning our program into a service

Imagine that instead of being generated on the Python shell, there was some external database, and instead of just being a name, the function also contained private information.

Let's rewrite our program with that in mind. We'll create a database of people and their favorite colors.

    db = {
        'alice': 'red',
        'bob': 'blue'}

    def make_person(name, color):
        return lambda guest_name: print(f"Hi {guest_name}, I'm {name} and I like {color}")

    people = [make_person(*record) for record in db.items()]

Remember, our secrets aren't contained within our database; they're contained within the functions themselves. While this example is trivial, we're starting to see how this could become interesting.
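One way to convince yourself of this is to throw the database away and check that the functions still know their secrets. This is a sketch under the same assumptions as before: the greeting is returned rather than printed, and "Carol" is a made-up guest.

```python
db = {"alice": "red", "bob": "blue"}

def make_person(name, color):
    return lambda guest_name: f"Hi {guest_name}, I'm {name} and I like {color}"

people = [make_person(*record) for record in db.items()]

# Discard the original records entirely.
db.clear()
del db

# The functions still carry each name and favorite color.
greetings = [person("Carol") for person in people]
print(greetings)
```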

Let's up the ante a bit by turning this into a network application.

    from flask import Flask, abort, request
    app = Flask(__name__)

    db = {
        'Alice': 'red',
        'Bob': 'blue'}

    def make_person(name, color):
        return lambda guest_name: f"Hi {guest_name}, I'm {name} and I like {color}.\n"

    people = {name: make_person(name, color)
              for (name, color) in db.items()}

    @app.route('/<person>')
    def show_greeting(person):
        # Unknown names get a 404 instead of an unhandled KeyError.
        if person not in people:
            abort(404)
        guest = request.args.get('guest')
        return people[person](guest)

And run it:

    serge@laptop:~$ curl http://localhost:5000/Alice?guest=Bob
    Hi Bob, I'm Alice and I like red.

Nifty, but not especially different from the previous example, except as it applies to the AGPL.

We can take this example in one of two directions, both of which I believe break the AGPL.

The first is that we might imagine the database contains some other secrets, and that we're encoding these secrets as code. Let's imagine a service that gives doctors, and other services we explicitly permit, access to health-related data about us.

As privacy-oriented developers, we may want to self-host this application. I certainly feel better about running my own services, especially where sensitive/private data is concerned.

As far as the standard GPL is concerned, this is no problem. My private version of my application that only runs on my computer is entirely mine. But the AGPL is different: making the service accessible over the network places the program under the same terms we would face if we were to distribute it.

Configuration as Code

How realistic is this scenario of using code for configuration? It's far more common than you might originally think. As Chris's talk points out, it's extremely common in Lisp to use this method, but it's not limited to Lisp by any means. Several popular Python web frameworks use a config.py file, and PHP developers use config.php.

This matters because, while the licenses do not pertain to running environments, these configuration systems turn the configuration "data" into running, executable code. That is distinct from, for example, pulling data from a YAML or config.ini file, because a config.py file is interpreted as code and becomes part of the program itself.

This is largely a non-issue because in a vast majority of cases there is a distinction between the types of static variables placed inside a configuration file and the dynamic code that's inside the program files, but this doesn't have to be the case. It's possible to write configuration that contains executable code, and if that executable code modifies the behavior of the application itself, then it is indistinguishable from program code.
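The difference between configuration as data and configuration as code can be sketched in a few lines. Everything here is illustrative: the INI section, the COLOR setting and the format_greeting function are made up, and exec on a string stands in for what importing a real config.py would do.

```python
import configparser

# Configuration as data: parsed into strings, never executed.
INI_TEXT = """
[greeting]
color = red
"""
parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
ini_color = parser["greeting"]["color"]

# Configuration as code: a stand-in for the contents of a config.py file.
CONFIG_PY_TEXT = '''
COLOR = "red"

def format_greeting(guest_name):
    # Behavior, not just a value: this changes what the application does.
    return f"Hello {guest_name}, the color is {COLOR.upper()}!"
'''
config_namespace = {}
exec(CONFIG_PY_TEXT, config_namespace)  # roughly what importing config.py does

print(ini_color)
print(config_namespace["format_greeting"]("Bob"))
```

The INI value can only ever be a string the program chooses how to use. The second "configuration" hands the program a new function, which is where the line between settings and program code starts to blur.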

Does this mean you can't write a Python application that uses config.py, or a PHP program that uses config.php, under the AGPL? In most cases, whether a variable is stored statically in one file or another makes no difference. But as configuration grows complex enough to include functionality, that line begins to blur. While I'm not a lawyer, I believe that unless the configuration files are licensed separately, configuration that is complex enough to be indistinguishable from code will need to be published as code under the AGPL.

Obviously this is not the intent of the AGPL, and this specific scenario is easily remedied by separating out and separately licensing the config files, but this is a conscious action that the developer must take.

Plugins

Let's take on a more complex version of this problem: What happens when applications are not simply monolithic, stand-alone things, but when they include components that are external in some way?

Chris, in a Reddit reply to this post, mentions browsers, so let's use that as an example. If you're reading this, you are most likely doing so in a web browser. You're also likely to have one or more plugins. Plugins are application logic that extends the functionality of your application in some way. The plugins may be under a variety of licenses, anything from extremely permissive to entirely proprietary.

If your browser is under the GPL, the waters become very murky as it relates to the licensing requirements of plugins. WordPress, the popular CMS and blogging platform, has stated that WordPress plugins should (or possibly must) be released under the GPL. That is because a plugin is not a stand-alone work. A plugin depends on the WordPress application framework, and thus plugins are derived from (or as GPLv3 calls it, "based on") the original program.

For GNU GPL applications, this is a bit of an oddity: while WordPress may require plugins to be under the GPL, it cannot compel users running proprietary plugins to provide source code to anyone. With the AGPL, a network user of the program has the same rights as a person downloading the program.

This is a lot to take in, but we're not quite done yet. In Spritely Goblins, the system Chris is developing, there is no distinction between a local program execution and one that runs on the network. While some developers may be used to thinking about remote procedure calls and remote APIs, the Goblins model makes this distinction largely invisible to the user and even the developer. Program logic may be run locally, on a nearby server owned by the same person, or halfway around the world by someone they've never met.

Goblins, by design, erases the distinction for a programmer between code that runs internally and code that runs externally. With it goes the distinction between code that is and isn't run at arm's length.

Under the GPL, this is no problem: network services are at arm's length, and thus there's no problem with integrating your GPLed internal code with some external proprietary service. But under the AGPL, network services are explicitly included.
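To give a feel for what erasing that distinction looks like from the caller's side, here is a toy sketch in Python. It is emphatically not the Goblins API: LocalRef, SimulatedRemoteRef and send are hypothetical names, and the "network" is just a JSON round trip inside one process.

```python
import json

class LocalRef:
    """Calls an object living in the same process."""
    def __init__(self, target):
        self._target = target

    def send(self, *args):
        return self._target(*args)

class SimulatedRemoteRef:
    """Stands in for a network hop: arguments and results cross a JSON boundary."""
    def __init__(self, target):
        # In a real system the target would live on another machine.
        self._target = target

    def send(self, *args):
        wire_request = json.dumps(args)
        result = self._target(*json.loads(wire_request))
        return json.loads(json.dumps(result))

def greet(ref, guest_name):
    # The caller cannot tell which kind of reference it was handed.
    return ref.send(guest_name)

alice = lambda guest_name: f"Hi {guest_name}, I'm Alice!"
local_result = greet(LocalRef(alice), "Bob")
remote_result = greet(SimulatedRemoteRef(alice), "Bob")
print(local_result == remote_result)
```

The greet function is written once and never learns where alice actually runs, which is exactly the property that makes "arm's length" hard to reason about.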

A brief review

...That was a lot to cover, so let's review briefly.

  • Some programs are going to be Free Software, but contain "proprietary parts" because they need to for privacy reasons.

  • Plugins that are written for an AGPLed system must be AGPLed, even if they operate across the network.

  • Therefore we have an impedance mismatch between the intent of the AGPL (to protect Software Freedom) and personal privacy, which is amplified on a system that makes no distinction between local and network code.

In the land of tomorrow...

Now that this is covered, let's get weird...

Spritely Goblins has the potential to do more than just provide remote procedure calls for remote applications- it's designed so that it could also take object code and safely execute it locally.

This may seem strange at first, but a longer-term goal of Spritely appears to be to take in-memory object code and ship it to another machine where it can be safely executed. I say "appears" here because I don't see mention of this in the Spritely docs, but it is something Chris and I have discussed privately.

In terms of functionality, this is extremely powerful, but it gets complicated when we talk about source code requirements. As people who have done work in the field of Reproducible Builds know, making software reproducible is not trivial. If we had to ship source code around instead of object code, the recipient system would bear a large burden: it would need not only to build the source but possibly also to replicate the remote environment.
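Python can give a rough taste of shipping object code without source. This sketch uses the standard marshal module to serialize a function's compiled code object and rebuild it on the "receiving" side. It is only an analogy for what Spritely might do, and it offers none of the safety that design calls for: the marshal documentation warns that the module is not secure against maliciously constructed data, and its format is specific to the Python version.

```python
import marshal
import types

def greet(guest_name):
    return f"Hi {guest_name}, I'm Alice!"

# "Sender": serialize only the compiled code object, not the source text.
wire_bytes = marshal.dumps(greet.__code__)

# "Receiver": rebuild a callable from the bytes it was handed.
received_code = marshal.loads(wire_bytes)
rebuilt = types.FunctionType(received_code, {})

print(rebuilt("Bob"))
```

The receiver can run rebuilt without ever seeing the source for greet, and it only works if both ends run a compatible Python, which is a small-scale version of the "replicate the remote environment" problem.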
