2019-09-11

Ballerina Programming Language - Part 0: Context

Well, it's been 9 years since my last blog post. It's been an eventful period in real life: I got married, we have two children, I became a Thai citizen, built a house and had major back surgery.

For the last 18 months, I have been working on the design of a new programming language called Ballerina. Version 1.0 of Ballerina has just been released, so now is a good time to start explaining what it's all about. In subsequent posts, I will delve into the technical details, but in this post I want to provide some context: the "who" and the "why".

TL;DR Ballerina was designed to be the core of a language-centric, cloud-native approach to enterprise integration. It comes from WSO2, which is a major open source enterprise integration vendor. I have been working on the language design and specification. I think it has potential beyond the world of enterprise integration.

The main person behind Ballerina is Sanjiva Weerawarana. I've known Sanjiva since around 1999 (20 years!), when we were both on the W3C XSL WG doing XPath and XSLT 1.0. Sanjiva at that time was working for IBM Research (where his boss at one point was Sharon Adler, who I had worked with on the ISO DSSSL committee).

This was the era of peak XML, before JSON was invented, and people were using XML for all sorts of things for which it was not very well-suited, including SOAP and the whole Web Services stack built on top of that. Sanjiva worked on several important parts of that including WSDL and BPEL.

Around 2005, Sanjiva decided he wanted to leave IBM and start a company with some fellow IBMers. He is from Sri Lanka, and wanted to go back. At that time, I was working for the Thai government. I had persuaded them to start an open source promotion activity, and I was running that for them (one day I should write a blog about that).

On Boxing Day 2004, there was a huge tsunami in the Indian Ocean, which was a disaster for several countries including Thailand and Sri Lanka. As part of the recovery process, the Thai government had organized an international IT conference in Phuket at the beginning of 2005. Sanjiva came to talk about Sahana, which was an effort started in Sri Lanka to use open source to help with recovery from the tsunami.

On the sidelines of the conference Sanjiva pitched me the idea for the company, at that time called Serendib Systems (the word serendipity comes from Serendip, which is an old name for Sri Lanka). The idea was to do open source related to web services, based in Sri Lanka. It was at the intersection of a number of my main interests at the time (XML and open source in developing countries), and I had confidence in Sanjiva, so it wasn't a hard decision to invest.

The name was changed to WSO2 (WS as in web services, O2 as in oxygen), Sanjiva took the role of CEO and I joined the board. WSO2 has grown steadily in the 14 years since it was founded, and now has about 600 employees. It has remained an open source company and it has developed a comprehensive open source enterprise integration platform. You may well never have heard of WSO2; we have always been rather better at the technical side of things than the marketing side. But we are actually a major vendor in the open source enterprise integration space, with lots of global Fortune 500 customers. In fact, there’s some Gartner report that says we are the world’s #1 open source integration vendor, although I’m not quite sure on what metric.

For quite some time, the workhorse of enterprise integration has been the Enterprise Service Bus (ESB). An ESB sends and receives network messages over a variety of transports, and there is a configuration language, typically in XML, that describes the flow of these messages. The configuration language can be seen as a domain-specific language (DSL) for integration. It supports abstractions like mediators, endpoints, proxy services and scheduled tasks, which allow a given message flow to be described at a higher level than would be possible if the equivalent code were written in a programming language such as Java or Go. ESB products (including WSO2's) typically include a GUI for editing the configuration language. The ESB's higher-level abstractions allow for a much more useful graphical view than would be possible with a solution that was written in a programming language.

The fact that an ESB is not a full programming language has important consequences. It means that at a certain point you fall off a cliff: there are things you simply cannot express in the XML configuration language. ESBs typically deal with this by allowing you to write extensions in Java. In practice, this means that complex solutions are written as a combination of XML configuration and Java extensions. This creates a number of problems. First, the ESB is tied to Java. 10 years ago that wasn't really a problem, but increasingly Java is the new COBOL. The cool kids are interested in Go, TypeScript or Rust and would not even consider Java. Oracle's stewardship of Java does not help. Second, the Java extensions are a black box as far as the graphical interface is concerned. Third, using multiple languages creates additional complexity for many aspects of the software development process: build, deployment, debugging. Fourth, it's bad in terms of the cognitive load that it places on the development team: the developers have to learn two quite different languages, and continually switch gears between them.

The other fundamental problem with the ESB concept is that it is designed for a centralized deployment model. The idea is that the IT department of an enterprise runs the Enterprise Service Bus for the entire enterprise. It is not only the large footprint of an ESB that pushes in this direction, but also the licensing model: ESBs are typically not cheap and are licensed on a per-server basis. If you think of the XML configuration language as a domain-specific programming language, and of the ESB as the runtime for that language, you in effect have one large program, controlling integration across the entire enterprise. Furthermore, this program is not written in a pleasant, modern programming language, with support for modularity, but is rather just a pile of XML. As you can imagine, this is not good for agility or DevOps.

This is the background that led to the creation of Ballerina. The high-level goal is to provide the foundation of a new approach to enterprise integration that is a better fit for current computing trends than the ESB. Obviously, the cloud is a hugely important part of this. The Ballerina concept evolved over a number of years. I see three stages:

  1. Let’s do a better DSL that looks more like a programming language!
  2. Let’s make it a full programming language!
  3. Let’s take a shot at becoming a mainstream programming language!

Stage 2 marks the start of the Ballerina project, and was when the name was chosen; that happened in August 2016.

My first involvement with Ballerina was at the beginning of 2017, when Sanjiva asked me to help with the design of the language support for XML. But I only started to get really deeply involved in Ballerina in February 2018. At that point there was already a working, proof-of-concept implementation. Sanjiva asked me to help write a language specification.

When I started, we did not think it would take all that long for me to write a specification. We were completely wrong about that! It's been 18 months already, and it is still a work-in-progress. What happened is that as we dug into the details of the language, it became apparent that there was a lot of scope for improvement in the design. The job turned out to be more about refining and evolving the language design, rather than just documenting what had been implemented. As it became clearer that the goal was to try eventually to become a mainstream programming language, so the quality bar for the implementation needed to be raised.

Sanjiva's primary area of expertise is distributed systems, and WSO2's collective expertise is centered around enterprise middleware, rather than programming language design and implementation. When they started the Ballerina project, I think they underestimated the enormity of the project that they had taken on, as did I to some extent. As I have been wrestling with the Ballerina language design, I have gained a much better appreciation of just how hard programming language design is. I have looked at many other programming languages for inspiration. I've been incredibly impressed by how good the current generation of programming languages are. I would particularly highlight TypeScript, Go, Rust and Kotlin. Each of them has a very different language concept, but every one of them has done an amazing job of designing a programming language that realizes their concept. I take my hat off to their designers.

I should say something about what a version of 1.0 means. I should first explain that we make a distinction between the implementation version and the specification version. 1.0.0 is the implementation version. Language specifications are labelled chronologically (it's a living standard!). The 1.0.0 implementation is based on the language specification labelled 2019R3, which means the 3rd release of 2019.
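To make the labelling scheme concrete, here is a small Python sketch of how a specification label could be split into its year and release number. This is purely illustrative and not part of any Ballerina tooling: the names SpecLabel and parse_spec_label are made up, and the assumption that a label is always a four-digit year, the letter R and a release number is inferred only from the 2019R3 example.

```python
import re
from typing import NamedTuple


class SpecLabel(NamedTuple):
    """A specification label such as 2019R3 (hypothetical helper type)."""
    year: int
    release: int  # 1-based release number within the year


# Assumed shape: four-digit year, then "R", then the release number.
_LABEL = re.compile(r"(\d{4})R(\d+)")


def parse_spec_label(label: str) -> SpecLabel:
    """Split a label like '2019R3' into (year, release)."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a specification label: {label!r}")
    year, release = int(match.group(1)), int(match.group(2))
    if release < 1:
        raise ValueError(f"release numbers start at 1: {label!r}")
    return SpecLabel(year, release)


if __name__ == "__main__":
    label = parse_spec_label("2019R3")
    print(label.year, label.release)
```

Because the parsed label is a (year, release) tuple, ordinary tuple comparison sorts labels chronologically, which fits the idea of a living standard with dated releases.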

1.0 does not mean that we have got either the language design or implementation to where we want it. If we lived in a world unsullied by commercial or competitive reality, we could easily spend a couple of years extending and improving the design and implementation. But WSO2 is not a huge company, and we have already made a very substantial investment in Ballerina (of the order of 50 engineers over 3 years). So we need to get something out there, so that we can get some proof points to justify continued investment. The benchmark for 1.0 is whether it works better for enterprise integration than our current ESB-based product. It needs to be sufficiently stable and performant that we can support it in production for enterprise customers.

We also have a reasonable degree of alignment between the language specification and the compiler: what the compiler implements is a subset of what the specification describes, with a couple of caveats. The first caveat is that there are some non-core features that are not quite stable. These are labelled "preview" in the specification. We expect to stabilise these soon, and that will involve some minor incompatible changes. The second caveat is that the implementation has some experimental features, which are not in the specification; we plan that the language will eventually include features that provide similar functionality.

The language design described by the current specification has two fundamental features that are unique (at least not part of any mainstream programming language). Its combination of other features is also unique: each feature is individually in some language, but no language has all of them. I think the language design is interesting not just for enterprise integration, but for any application which is mainly about combining services, whether consuming them or providing them. As things move to the cloud, more and more applications will fall into this category. Although the current state of the language design is interesting, I think the potential is even more interesting. Over the next year or two, we will stabilize more of the integration-oriented language features, which will make Ballerina quite different from any other programming language. Unfortunately, it takes a lot of work to get the general-purpose features solid and that has to be done before the more domain-specific features can be finalized.

Overall, the 2019R3 language design and the 1.0 implementation are an initial, stable step, but there is still a long way to go.

In future posts, I will get into the design of the language. In the meantime, you can try out the implementation and read the specification. The design process was initially quite closed, but has gradually become more open. Most of the discussion on the spec happens in issues in the spec's GitHub repository. Major new language features have public proposals. Comments and suggestions are welcome; the best way to provide input is to open a new issue.

See the next post in the series.

7 comments:

Unknown said...

Welcome back, interesting to hear what you're working on these days. We are embarking on a major SoA business transformation project soon so this is timely!

Thad Guidry said...

ugh, Standard library (batteries) not included. I'll pass.

James Clark said...

The standard library is part of the Ballerina platform and is documented here.

Liam Quin said...

Suggest adding an RPM package for RHEL and CentOS, as these are very common in server contexts.

The description of the type system reminds me a bit of EQLOG from the 1980s, a logic programming language where type safety was computed by the compiler based on type axioms.

I think also the value of examples and tutorials in increasing review & deployment should not be underestimated.

Sanjiva Weerawarana said...

RPM will come soon as will a zip distro so anyone can download and run it.

There are lots of examples here: https://v1-0.ballerina.io/learn/by-example/. We also have a series of long tutorials and will be pushing them out soon.

Liam Quin said...

Thanks, Sanjiva.

Anonymous said...

Agreed with Liam. Ballerina-by-Example is really impressive, but the spec itself should include examples and notes.