Redocly CLI (fka OpenAPI CLI)

What?

Redocly CLI (fka OpenAPI CLI) is an open-source product for OpenAPI documentation development that I’ve been designing & developing as part of the team at Redocly. It is one of the fastest OpenAPI/Swagger toolchains, using context-aware tree walking to achieve its speed & flexibility.

Eventually, we integrated redocly-cli into our API Docs CI/CD platform. It helped us win customers such as GitHub, Hewlett Packard Enterprise, and T-Mobile.

Later on, this linter made its way into a VS Code plugin, finally enabling what we had strived for from the beginning of the journey: a seamless in-editor experience with a lightning-fast feedback loop for technical writers.

Why?

Quite a while back, when I had just started working at Redocly, our CTO gave me a task: check whether we could create our own OpenAPI linter. Why? We at Redocly had a strong belief (I share it to this day) that technical (especially API) documentation can benefit from the same approaches and tools that we engineers use: storing the “source” in a version control system, having CI/CD pipelines validate newly created docs, releasing versions, and ultimately using the same development environment as engineering colleagues.

When using IDEs or modern editors, we engineers have all the advantages of syntax highlighting, smart autocomplete, type checking, code-style linting, and so on right at our fingertips. But at that time the situation for technical writers was very different: you could either use quite limiting GUI tools to design your API documentation, or venture into editing your OpenAPI/Swagger files by hand and wait until your docs were built & rendered to check for any kind of mistake. We wanted to change this status quo.

What we wanted to offer was a tool similar to the linters we’d had for years in programming – a linter that could take your API definition, assemble it from multiple files, verify that it is syntactically correct, check it against defined (and easily extensible) semantic rules (for example, having examples for API calls in the same language, or adhering to a naming strategy), and finally suggest automated fixes where possible. Of course, all of the above steps should take less than 1 second to enable a seamless “write -> check -> correct -> repeat” loop. On the other hand, the memory footprint also needed to be rather small to enable running on lower-end CI instances, both in our API documentation CI/CD product and in the open-source version.
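To make the “extensible semantic rules” idea concrete, here is a rough sketch of what such a rule could look like, written as a visitor in TypeScript. All names here (Operation, RuleContext, report) are illustrative only, not the actual redocly-cli plugin API:

```typescript
// Hypothetical visitor-style rule: enforce camelCase operationIds
// (an example of the "naming strategy" kind of semantic rule).
interface Operation {
  operationId?: string;
}

interface RuleContext {
  // Collects a problem report for the current node.
  report: (problem: { message: string; location: string }) => void;
  // e.g. a JSON pointer to the node being visited.
  location: string;
}

export const operationIdCamelCase = () => ({
  // Invoked by the walker for every Operation node it encounters.
  Operation(operation: Operation, ctx: RuleContext) {
    const id = operation.operationId;
    if (id !== undefined && !/^[a-z][a-zA-Z0-9]*$/.test(id)) {
      ctx.report({
        message: `operationId "${id}" should be camelCase`,
        location: ctx.location,
      });
    }
  },
});
```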

At the time there was only one real competing product: Spectral by Stoplight.io. We tried it out extensively and found that its performance was not quite as good as we wanted – it tended to crash on API definitions larger than roughly 10 MiB, and processing could take more than 10 seconds. So, while a step in the right direction, it was not quite there yet.

How?

I started by implementing a very simple approach (a rough sketch follows the list):

  1. Parse the OpenAPI definition into a JavaScript object.
  2. Walk this object and at each “step” determine the path from the top of the object. Then look up all the rules/validations that apply to the given path pattern and execute them.
  3. Rinse and repeat for all the nested levels of the API definition.
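A minimal sketch of that first approach, with illustrative names and none of the real implementation details, could look like this:

```typescript
// Naive walker: recursively visit every node, rebuild the path string at
// each step, and run every rule whose pattern matches the current path.
type PathRule = {
  pattern: RegExp;
  check: (node: unknown, path: string) => void;
};

function walk(node: unknown, path: string, rules: PathRule[]): void {
  // Step 2: look up the rules that apply to this path and execute them.
  for (const rule of rules) {
    if (rule.pattern.test(path)) rule.check(node, path);
  }
  // Step 3: rinse and repeat for all nested levels.
  if (node !== null && typeof node === 'object') {
    for (const [key, child] of Object.entries(node)) {
      walk(child, `${path}/${key}`, rules);
    }
  }
}

// Step 1 would be parsing the YAML/JSON source, then:
// walk(parsedDefinition, '', rules);
```

Note that every single node pays the cost of matching its stringified path against every registered rule pattern, which is exactly where this approach loses time.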

While quite simple, this approach unfortunately proved to be inefficient and slow. It also made it cumbersome to create custom rules that needed information from several levels of the API definition.

Then I finally realized that we were not the first ones to build a linter/processor for a machine-readable language. So, why not go back to basics?

Indeed, there were existing parsers for YAML/JSON that would generate a proper syntax tree, which we could then walk in our tool. There was an additional step of enriching the resulting syntax tree with type information, but with the well-defined schema of the OpenAPI standard this was quite easy to do. With this new approach I saved myself from having to match stringified paths for every node: the runner always knows what type of node it is processing (and can easily get the rule actions registered for that type). What followed, of course, was a lot of tinkering and minute optimizations, adding new rules, thinking through default values, and so on. But ultimately, the “grown-up language parsing” approach allowed us to cut the runtime for 5–10 MiB API definitions down to under 1 second.
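For illustration, here is a hedged sketch of the type-aware walking idea: a schema drives the descent, so visitors are looked up directly by node type instead of by matching paths. All names are made up for the example, and map-like nodes (such as paths, whose keys are arbitrary) are omitted for brevity:

```typescript
// Type-aware walker: the OpenAPI schema tells the walker which type each
// child node has, so visitors are found by a direct dictionary lookup.
type NodeType = 'Root' | 'PathItem' | 'Operation' | 'Schema';

interface NodeTypeSchema {
  // Maps a property name to the node type found under that property.
  properties: Record<string, NodeType>;
}

type Visitor = (node: Record<string, unknown>) => void;

function walkTyped(
  node: Record<string, unknown>,
  type: NodeType,
  schemas: Record<NodeType, NodeTypeSchema>,
  visitors: Partial<Record<NodeType, Visitor[]>>,
): void {
  // Direct lookup: the runner already knows the current node's type,
  // so there is no per-node pattern matching at all.
  for (const visit of visitors[type] ?? []) visit(node);

  // Descend using the schema, carrying each child's type along.
  for (const [key, childType] of Object.entries(schemas[type].properties)) {
    const child = node[key];
    if (child !== null && typeof child === 'object') {
      walkTyped(child as Record<string, unknown>, childType, schemas, visitors);
    }
  }
}
```

In this sketch, replacing per-node pattern matching with a constant-time lookup by node type is roughly where the speedup comes from.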