Third-party libraries are no party at all

What better way to end the week than with a hot take?

In my 8 years at Lyft, product managers and engineers have often wanted to add third-party libraries to one of our apps. Sometimes it’s necessary to integrate with a specific vendor (like PayPal), sometimes it’s to avoid having to build something complicated, and sometimes it’s simply to avoid reinventing the wheel.

While these are generally reasonable considerations, the risks and associated costs of using a third-party library are often overlooked or misunderstood. In some cases the risk is worth it, but to determine that you first need to be able to define the risk accurately. To make that assessment more transparent and consistent, we defined a process for evaluating how much risk we incur by integrating a library and shipping it in one or more production apps.

Risks

Most larger organizations, including ours, have some form of code review as part of their development practices. For those teams, adding a third-party library is equivalent to adding a bunch of unreviewed code written by someone who doesn't work on the team, subverting the standards upheld during code review and shipping code of unknown quality. This introduces risk to how the app runs, to the long-term development of the app, and, for larger teams, to the business overall.

Runtime risks

Library code generally has the same level of access to system resources as the app's own code, but library authors don't necessarily apply the best practices the team has put in place for managing those resources. Libraries have access to the disk, network, memory, CPU, etc. without any restrictions or limitations, so they can (over)write files to disk, hog memory or CPU with unoptimized code, cause deadlocks or main-thread delays, download (and upload!) tons of data, and more. Worse, they can cause crashes or even crash loops. Twice.

Many of these situations aren't discovered until the app is already available to customers, at which point fixing the problem requires creating a new build and going through the review process, which is often time-intensive and costly. The risk can be somewhat mitigated by invoking the library behind a feature flag, but that isn't a silver bullet either (see below).
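
As a rough sketch of what that gating can look like: the idea is to only initialize and call the library when a remotely controlled flag is on, so a misbehaving integration can be switched off without shipping a new build. `FeatureFlagging`, `VendorSDK`, and the flag name below are hypothetical stand-ins, not real Lyft or vendor APIs.

```swift
import Foundation

// Hypothetical flag provider; in practice this would be backed by remote config.
protocol FeatureFlagging {
    func isEnabled(_ flag: String) -> Bool
}

// Stand-in for a hypothetical third-party library.
final class VendorSDK {
    init(apiKey: String) {}
    func start() {}
}

final class VendorIntegration {
    private var sdk: VendorSDK?

    func startIfEnabled(using flags: FeatureFlagging) {
        // Only touch the library when the flag is on, so a bad release can be
        // disabled server-side instead of waiting on a new build and review.
        guard flags.isEnabled("vendor_sdk_integration") else { return }
        let sdk = VendorSDK(apiKey: "<key>")
        sdk.start()
        self.sdk = sdk
    }
}
```

Note that this only helps for code paths the flag actually guards; a library that does work at load time, or that crashes once it has been started, can still cause trouble.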

Development risks

To quote a coworker: "every line of code is a liability", and that's even more true for code you didn't write yourself. Libraries can be slow to adopt new technologies or APIs, holding the codebase back, or adopt them too quickly, forcing a minimum deployment target higher than we'd like. When Apple and Google introduce new OS versions each year, they often require developers to update their code based on changes in their SDKs, and library developers have to follow suit. That requires coordinated effort, aligned priorities, and the ability to get the work done in a timely manner.

As the mobile platforms are ever-changing, this becomes a continuous, ongoing risk, compounded by the fact that teams and organizations aren't static either. When a library integrated by a team that no longer exists needs to be updated, it can take a long time to figure out who should do it. It has also proven extremely rare, and extremely difficult, to remove a library once it's in, so we treat every library as a long-term maintenance cost.

Business risks

As I mentioned above, modern OSes make no distinction between app code and library code, so in addition to system resources, libraries also have access to user information. As app developers we're responsible for using that information properly, and any libraries we ship are part of that responsibility.

If the user grants location access to the Lyft app, any third-party library in the app automatically gets that access too. It could then upload that data to its own servers, a competitor's servers, or who knows where else. This is even more problematic when a library needs a new permission we didn't already have.

Similarly, a system is only as secure as its weakest link, but if you include unreviewed, unknown code, you have no idea how secure it really is. Your carefully designed secure-coding practices can all be undone by one misbehaving library. The same goes for any policies Apple and Google put in place, like "you are not allowed to fingerprint the user".

Mitigating the risk

When evaluating a library for production usage, we ask a few questions to understand the need for the library in the first place.

Can we build this functionality in-house?

In some cases we were able to simply copy/paste the parts of a library we really needed. In more complex scenarios, where a library talked to a custom backend, we reverse-engineered that API and built a mini-SDK ourselves (again, covering only the parts we needed). This is the preferred option 90% of the time, but it isn't always feasible when integrating with very specific vendors or requirements.
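
To make "only the parts we needed" concrete, a mini-SDK for a vendor often boils down to a small client for the one or two endpoints we actually call. The sketch below is purely illustrative; the endpoint path, payload shape, and type names are made up, not any real vendor's API.

```swift
import Foundation

// Hypothetical mini-SDK: instead of pulling in a vendor's full library,
// implement just the single endpoint the app needs.
struct CreatePaymentRequest: Encodable {
    let amountCents: Int
    let currency: String
}

struct CreatePaymentResponse: Decodable {
    let id: String
    let status: String
}

struct MiniPaymentClient {
    let baseURL: URL   // e.g. the vendor's API host
    let apiKey: String
    let session: URLSession = .shared

    func createPayment(_ request: CreatePaymentRequest) async throws -> CreatePaymentResponse {
        var urlRequest = URLRequest(url: baseURL.appendingPathComponent("v1/payments"))
        urlRequest.httpMethod = "POST"
        urlRequest.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
        urlRequest.setValue("application/json", forHTTPHeaderField: "Content-Type")
        urlRequest.httpBody = try JSONEncoder().encode(request)

        let (data, _) = try await session.data(for: urlRequest)
        return try JSONDecoder().decode(CreatePaymentResponse.self, from: data)
    }
}
```

A client like this goes through our normal code review, uses our own networking conventions, and carries no transitive dependencies.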

How many customers benefit from this library?

In one scenario, we considered adding a very risky library (according to the criteria below) intended for a tiny subset of users, while still shipping it to all of our users. We would have run the risk of something going wrong for every customer in every market, all for a small group of customers we thought would benefit from it.

What transitive dependencies does this library have?

We'll want to evaluate the criteria below for all dependencies of the library as well.

What are the exit criteria?

If integration is successful, is there a path to moving it in-house? If it isn't successful, is there a path to removal?

Evaluation criteria

If at this point the team still wants to integrate the library, we ask them to “score” the library according to a standard set of criteria. The list below is not comprehensive but should give a good indication of the things we look at.

Blocking criteria

These criteria will prevent us from including the library altogether, either technically or by company policy, and need to be resolved before we can move forward:

Major concerns

We assign point values to each of these criteria (and a few others) and ask engineers to tally them up for the library they want to include. While low scores aren't hard-rejected by default, we often ask for more justification before moving forward.
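
To illustrate the tallying, a score sheet might look something like the following; the criteria names and point values here are placeholders, not our actual rubric.

```swift
// Hypothetical scoring sketch: each criterion a library satisfies awards
// points, and the proposing engineer sums the ones that apply.
struct Criterion {
    let name: String
    let points: Int
}

// Placeholder criteria and weights, purely for illustration.
let satisfied = [
    Criterion(name: "Actively maintained in the last year", points: 3),
    Criterion(name: "No transitive dependencies", points: 2),
    Criterion(name: "Requires no new permissions", points: 5),
]

let total = satisfied.reduce(0) { $0 + $1.points }
print("Library score: \(total)")  // low totals get extra scrutiny
```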

Final notes

While this process may seem very strict, and the potential risk hypothetical in many cases, we have actual, real examples of every scenario described in this blog post. Having the evaluations written down and publicly available also helps convey relative risk to people unfamiliar with how mobile platforms work, and demonstrates that we're not evaluating these risks arbitrarily.

Also, I don't want to claim that every third-party library is inherently bad. We actually use quite a few at Lyft: RxSwift and RxJava, Bugsnag's SDK, Google Maps, TensorFlow, and a few smaller ones for very specific use cases. But all of these are either well vetted, or we've decided the risk is worth the benefit while having a clear idea of what those risks and benefits really are.

Lastly, a developer pro tip: always create your own abstractions on top of a library's APIs and never call those APIs directly from the rest of the codebase. This makes it much easier to swap out (or remove) the underlying library in the future, again mitigating some of the risk associated with long-term development.
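
For example, a thin facade like the one below keeps the rest of the app unaware of which vendor is underneath; `VendorAnalytics` is a stand-in for whatever SDK you actually use, not a real API.

```swift
// Stand-in for a hypothetical third-party analytics SDK.
final class VendorAnalytics {
    static let shared = VendorAnalytics()
    func logEvent(_ name: String, parameters: [String: String]) {}
}

// App code only ever talks to this protocol, never to the vendor SDK directly.
protocol AnalyticsTracking {
    func track(event: String, properties: [String: String])
}

// The one place the third-party API is referenced. Swapping vendors (or
// removing the library) means rewriting only this adapter.
final class VendorAnalyticsTracker: AnalyticsTracking {
    func track(event: String, properties: [String: String]) {
        VendorAnalytics.shared.logEvent(event, parameters: properties)
    }
}

// Call sites depend on the abstraction, not the vendor:
func reportRideRequested(using analytics: AnalyticsTracking) {
    analytics.track(event: "ride_requested", properties: ["source": "home_screen"])
}
```

The adapter is also a natural place to put the feature-flag gating described earlier, so both mitigations live behind a single seam.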