- API designs must make sense from the point of view of a third-party app developer (start by designing high-level APIs, and only add daemons if they are necessary)
- Interfaces that don't have to be API should not be API (minimize surface area)
- Use existing frameworks where we can; if we can't use them directly, learn from their design
- Identify privilege boundaries, do not trust less-privileged components, and consider whether some features should be restricted
Minimize "surface area"
The "SDK API" is intended to remain stable and compatible over time. By putting an interface in the "SDK API", we commit to keeping it working: third-party code that uses stable Apertis APIs must continue to work in future releases, without needing changes.
As a result, one of the most important questions to ask about new public interfaces is: does this need to be public API right now? If it doesn't, then it can be private, at least to begin with. We can change private APIs to be public later if they turn out to be necessary, but changing public APIs to be private would be a compatibility break.
Initially omitting interfaces from the public API means fewer things whose stability we need to guarantee, which means fewer constraints on how we improve the platform in future. Conversely, if we put too many things in the public API (guarantee too much) too early, we'll probably regret it in future.
Some examples of applying that principle:
- hiding struct contents by using a MyObjectPrivate struct instead of putting members in the MyObject struct (in GObject, use G_ADD_PRIVATE() or G_DEFINE_TYPE_WITH_PRIVATE())
- considering the D-Bus API between a built-in or otherwise special app, and the system components it uses, to be private
- if in doubt, making things private initially, and making them public if it later proves to be necessary
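The first example in the list above can be sketched in plain C as an opaque type. This is only an illustration of the idea: `DemoCounter` and its functions are hypothetical names, and real GObject code would use `G_DEFINE_TYPE_WITH_PRIVATE()` or `G_ADD_PRIVATE()` rather than hand-written allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Public header part: this is all that third-party code sees ---- */

/* Opaque type: callers only ever hold a pointer, so members can be added,
 * removed or reordered later without changing the public API. */
typedef struct DemoCounter DemoCounter;

DemoCounter *demo_counter_new (int initial_value);
void demo_counter_increment (DemoCounter *self);
int demo_counter_get_value (const DemoCounter *self);
void demo_counter_free (DemoCounter *self);

/* ---- Private implementation part: never installed as a header ---- */

struct DemoCounter
{
  int value;
  /* Hypothetical internal detail, free to change in a later release. */
  unsigned int n_increments;
};

DemoCounter *
demo_counter_new (int initial_value)
{
  DemoCounter *self = calloc (1, sizeof (DemoCounter));

  if (self == NULL)
    return NULL;

  self->value = initial_value;
  return self;
}

void
demo_counter_increment (DemoCounter *self)
{
  assert (self != NULL);
  self->value++;
  self->n_increments++;
}

int
demo_counter_get_value (const DemoCounter *self)
{
  assert (self != NULL);
  return self->value;
}

void
demo_counter_free (DemoCounter *self)
{
  free (self);
}
```

Because callers can only reach the state through the four functions, `n_increments` could be dropped or replaced later without a compatibility break.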
Have as few daemons as possible, but no fewer
Sometimes it's necessary to have more than one process (app code talking to a daemon/service/agent, typically a D-Bus service). There are lots of good reasons to do that:
- having a privilege boundary
- mediating between multiple processes that all want to manipulate the state of the same object (for instance, Barkway decides the order of the popup stack, which is global shared state)
- having something persist in the background when switching between apps or closing/reopening apps (for instance, Telepathy puts telephony, instant messaging and other real-time communications in the background)
However, every time we introduce inter-process communication between two components, we increase the complexity of the system, which increases the cost (time) of building and maintaining it.
As a result, if we don't need the extra process because none of the reasons above apply, we should try not to have it: it's "cheaper" to use a shared library that gets loaded into the app process. For instance, Grilo is just a library, and doesn't have an associated daemon. Similarly, if there's an opportunity to reduce the number of layers of "a daemon talking to a daemon talking to a daemon", we should probably consider it.
Another consideration is whether daemons with similar privileges, performance characteristics and lifetimes can be merged together: for instance, as of May 2015 we have Barkway and Mutter as separate components, whereas GNOME Shell puts notifications and the window manager (among other things) in the same process. If the requirements allow it, we should do the same.
Be aware of where the privilege boundaries are
Not all of the code in Apertis is equally-privileged: components that run as root are more privileged than components that run as the user, and components with a permissive AppArmor profile are more privileged than components with a restrictive profile.
Whenever two components with different privilege levels communicate, we should be aware of where the boundary is, and what should be allowed to cross it. If the more privileged component trusts the less privileged component too much, then we have privilege escalation: the less privileged component can effectively gain the more privileged component's privileges, by asking it to carry out whatever action is being restricted. The result is that the privilege boundary doesn't actually provide any security.
When crossing a privilege boundary, the question to ask is: if the less privileged side is being actively malicious, what's the worst that can happen? For instance, the less privileged side might be a malicious third-party app (malware), or it might be a legitimate component which an attacker has been able to compromise via a security vulnerability.
One important example is where the less privileged component communicates with a service via a D-Bus API. If the method call has a username or app_name parameter, a malicious or compromised app could use a different user's name, or a different app's name: what would happen then?
We can avoid problems in situations like this by using information that comes from a trusted component (the Linux kernel, systemd or dbus-daemon) and cannot be faked. For instance, the GetConnectionCredentials D-Bus method tells you the user ID and AppArmor profile of a process on the bus: this information comes from dbus-daemon, and can be trusted. If we can derive the app name from the AppArmor profile, then that cannot be faked either, so we can reliably identify an app by its profile.
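A minimal sketch of that idea, in plain C, might look like the following. Everything here is hypothetical: `DemoPeerCredentials` stands in for the trusted data a service could obtain via GetConnectionCredentials, and the `/Applications/<app-name>/...` label layout is an assumption made for this sketch, not a statement of platform policy. The point is that the caller's identity is never taken from a caller-supplied parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for trusted data obtained from dbus-daemon. */
typedef struct
{
  unsigned long user_id;
  const char *apparmor_label;
} DemoPeerCredentials;

/* Assumption for this sketch only: an app's AppArmor label has the form
 * "/Applications/<app-name>/...". */
#define DEMO_APP_LABEL_PREFIX "/Applications/"

/* Copy the app name derived from the trusted label into buf.
 * Returns false if the label does not identify an app, or buf is too small. */
bool
demo_app_name_from_credentials (const DemoPeerCredentials *creds,
                                char *buf,
                                size_t buf_len)
{
  const size_t prefix_len = strlen (DEMO_APP_LABEL_PREFIX);
  const char *name;
  size_t name_len;

  if (creds == NULL || creds->apparmor_label == NULL || buf == NULL)
    return false;

  if (strncmp (creds->apparmor_label, DEMO_APP_LABEL_PREFIX, prefix_len) != 0)
    return false;

  name = creds->apparmor_label + prefix_len;
  name_len = strcspn (name, "/");

  if (name_len == 0 || name_len >= buf_len)
    return false;

  memcpy (buf, name, name_len);
  buf[name_len] = '\0';
  return true;
}

/* Decide whether the caller may access data belonging to target_app.
 * The caller's own identity comes only from the trusted credentials;
 * there is deliberately no "claimed app name" parameter to spoof. */
bool
demo_may_access_app_data (const DemoPeerCredentials *caller,
                          const char *target_app)
{
  char caller_app[64];

  if (target_app == NULL)
    return false;

  if (!demo_app_name_from_credentials (caller, caller_app, sizeof caller_app))
    return false;

  return strcmp (caller_app, target_app) == 0;
}
```

With this shape, a compromised app can ask for another app's data, but it cannot pretend to be that app, so the request is refused.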
Start from the API that the app developer will use
This is related to the "surface area" and "fewer daemons" reasoning above. The requirements for our SDK APIs take the form "third-party apps should be able to do X, Y and Z". In many cases, it is tempting to address this by providing a D-Bus API that the third-party app can use, with auto-generated GDBus C "bindings".
In Collabora's experience, auto-generated code has limits: it's often the quickest way to get from nothing to a minimum acceptable API, but it usually produces somewhat weird APIs, which could be easier to use if they were designed as C APIs from the beginning. When we started developing telepathy-glib, we mistakenly thought that most of it could be generated code. However, over time, we realised that there was quite a low limit to its quality and usability if we stuck to that idea: the current approach has much more focus on high-level C APIs, with the D-Bus API designed separately in support of the desired C API.
It often results in better (nicer to use) APIs if the starting point for the design is the GObject-style C API that we intend third-party applications to use, in the form of a library logically arranged into appropriate objects: we could even consider starting with stub implementations that work with "mock" results, and filling in the real implementation afterwards (for instance a mock version of a contacts database might return a list of hard-coded contacts).
That C API might have to evolve over time as we fill in the implementation, but if the general outline can stay intact, it's a good indication that the result is going to be something that apps can use.
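The "mock" starting point mentioned above could look something like this sketch. The names (`DemoContact`, `demo_contacts_list()`) and the contact data are invented for illustration, and a real API would more likely be GObject-based and asynchronous; the sketch only shows the public signature being fixed first, with hard-coded results behind it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical public type that apps would see. */
typedef struct
{
  const char *display_name;
  const char *phone_number;
} DemoContact;

/* Hard-coded results standing in for a real contacts database. */
static const DemoContact demo_mock_contacts[] = {
  { "Alice Example", "+1 555 0100" },
  { "Bob Example", "+1 555 0101" },
};

/* Public API under design: returns an array owned by the library and
 * stores its length in n_contacts. Apps are written against this
 * signature; the body can later be replaced by a real backend (for
 * instance one that talks to a service) without changing callers. */
const DemoContact *
demo_contacts_list (size_t *n_contacts)
{
  assert (n_contacts != NULL);

  *n_contacts = sizeof demo_mock_contacts / sizeof demo_mock_contacts[0];
  return demo_mock_contacts;
}
```

Writing a small app against the stub is a cheap way to find out whether the API is pleasant to use before committing to an implementation.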
Think about how much we want the app to be allowed to do
This is not normally a concern, but in Apertis it is, because there's a privilege boundary between apps (whether our apps or third-party apps). We need to consider (and document) which of these categories each feature is in:
- all apps (including third-party ones) should be able to do this
- apps should be able to do this if they have some special flag in their manifest
- only trusted components within Apertis should be able to do this
For instance, in Android, all apps can write to /sdcard; only apps that have asked for the "permission" can record audio; and only trusted system components can install/uninstall apps.
If two parts of an API are in different categories, or if two parts of an API are both in the second category but need different flags (so that they are usable by different apps), that's probably a sign that there needs to be a split between those APIs.
For instance, looking at Frampton and Tinwell as of May 2015, recording probably needs to be locked-down more than playback: the worst case for playback is that the driver is annoyed and turns down or mutes the volume, whereas the worst case for recording is sending private conversations to the Internet. So recording and playback should have a clear division between them.
Follow GNOME conventions
Our platform includes a lot of GNOME-related libraries such as GLib and Clutter, and our API guidelines follow GObject/GNOME style quite closely. Application developers will find it easier to use our APIs if we are consistent about following GObject conventions.
Not every developer working on these components is familiar with GObject conventions. If in doubt, ask a developer with more GObject experience, or look at how GIO or GTK+ 3 does similar things.
See the Coding Conventions on this wiki for more on this topic, but here is a brief summary:
- namespace objects with the library's appropriate prefix
- use GLib naming conventions: CapitalizedWords for types; CAPITALS_AND_UNDERSCORES for constants; lowercase_and_underscores for parameters, struct members and variables
- avoid Hungarian notation (pSomePointer, v_func_returning_void(), etc.)
- use GError to report recoverable runtime errors
- use g_return_[val_]if_fail() to report failed precondition checks and other library-user errors
- use GObject features where appropriate: signals, properties, construct-time properties, virtual methods (vfuncs)
- use GIO features where appropriate: GAsyncResult, GCancellable, GTask
- prefer to use GAsyncResult instead of your own function typedef for asynchronous operations
- prefer to use signals instead of your own function typedef for events
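The naming rules and the split between recoverable errors and caller mistakes can be illustrated with a small sketch that uses only the C standard library. `DemoError` is a hypothetical, much simplified stand-in for `GError`, and `assert()` stands in for `g_return_val_if_fail()`; real platform code should use the GLib facilities named in the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Constants: CAPITALS_AND_UNDERSCORES, with the library prefix. */
#define DEMO_PLAYLIST_MAX_TRACKS 2

/* Types: CapitalizedWords, with the library prefix. */
typedef enum
{
  DEMO_ERROR_NONE,
  DEMO_ERROR_PLAYLIST_FULL
} DemoErrorCode;

/* Hypothetical, simplified stand-in for GError. */
typedef struct
{
  DemoErrorCode code;
  const char *message;
} DemoError;

typedef struct
{
  const char *tracks[DEMO_PLAYLIST_MAX_TRACKS];
  size_t n_tracks;
} DemoPlaylist;

/* Functions, parameters and members: lowercase_and_underscores, with no
 * Hungarian prefixes such as pPlaylist or szUri.
 *
 * Returns true on success. On a recoverable runtime failure, returns
 * false and fills in *error if the caller passed a non-NULL location. */
bool
demo_playlist_append (DemoPlaylist *playlist,
                      const char *uri,
                      DemoError *error)
{
  /* Caller mistakes are precondition failures, not runtime errors. */
  assert (playlist != NULL);
  assert (uri != NULL);

  if (playlist->n_tracks >= DEMO_PLAYLIST_MAX_TRACKS)
    {
      if (error != NULL)
        {
          error->code = DEMO_ERROR_PLAYLIST_FULL;
          error->message = "playlist is full";
        }

      return false;
    }

  playlist->tracks[playlist->n_tracks++] = uri;
  return true;
}
```

A full playlist is something an app can reasonably handle at run time, so it is reported through the error parameter; passing a NULL playlist is a bug in the caller, so it is treated as a failed precondition instead.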
Use existing software, or if we can't, learn from it
Several components in Apertis overlap with existing open-source projects, in which people have already spent a lot of time understanding a particular problem-space, making mistakes (for instance MPRIS version 1), and recovering from those mistakes. If we can learn from their mistakes, we can take a short-cut past the process of making and learning from our own mistakes, and arrive directly at a solution.
For instance, Tinwell, the media player daemon, looks quite a lot like MPRIS (with some extensions); Barkway, the notifications component, has a lot in common with org.freedesktop.Notifications; and Lightwood/Thornbury/Mildenhall, the UI widget libraries, are solutions for the same sort of problem-space as GTK+, Mx, GNOME Shell's St, and even QtGui.
For some of those components, we might be able to adopt entire APIs from the existing project, perhaps with Apertis-specific extensions to fill in missing functionality. For instance, if Tinwell provided an MPRIS2 interface, that would cover 90% of what it does, and would make sure we've avoided the problems that MPRIS1 had.
If our requirements prevent us from re-using existing APIs (for instance, we're not using GTK+ for user interface widgets), the next best thing is to compare our APIs with the existing ones. Where they differ, there are several possibilities: perhaps our API specifically needs to be different to meet one of our requirements (in which case we keep it); or perhaps the difference doesn't really matter either way; or perhaps the difference points to a design issue in our API, in which case correcting that issue gives us a better API.