KubeCon & CloudNativeCon NA 2018: Stability, CI/CD, Debugging Microservices, and Envoy Proxy (all the things!)

The Datawire team have returned from a very enjoyable KubeCon NA, hosted in Seattle. It was a great experience meeting so many of our users, customers and community members again at our booth. We also presented several sessions and attended a few more, so we learned a bunch of new things too. Here is a summary of our key takeaways.

Kubernetes and Envoy are Stable (“Boring”?) Technologies

Obviously, with this being KubeCon, the presenters were likely a little biased, but many organisations and thought leaders echoed the same message from the stage: foundational cloud native platform technologies like Kubernetes and Envoy Proxy are stable and ready for general use (and may even be a little “boring”, in a good way!). On both the keynote stage and throughout the breakout sessions, large and traditionally technologically conservative end-user organisations like Walmart, T-Mobile and The Internet Archive all discussed how they are using these technologies within digital transformation projects.

The biggest challenges remaining with these technologies appear to be the creation of effective tooling and control planes, and the integration into (or adaptation of) existing developer workflows. For example, Groupon talked about how they had created a series of tools to effectively hide the day-to-day workings of Kubernetes, and Airbnb discussed how they had created tooling that provided shortcuts for working with Kubernetes while still displaying the underlying kubectl commands being issued.

Pinterest talked about how they enabled their ops team to map existing routing and Varnish configuration to Envoy configuration using templates and manual curation, and we at Datawire explored our rationale for enabling application teams to write Ambassador annotations within Kubernetes Service configuration, which are then used to generate Envoy configuration.
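As an illustrative sketch (the service name, URL prefix and ports here are hypothetical), an Ambassador annotation of that era was embedded directly in the Kubernetes Service manifest, and Ambassador translated it into Envoy routing configuration:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service            # hypothetical service name
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: my_service_mapping
      prefix: /my-service/    # external URL prefix routed to this Service
      service: my-service
spec:
  selector:
    app: my-service
  ports:
    - port: 80
      targetPort: 8080
```

The appeal of this approach is that routing configuration lives alongside the Service definition it applies to, so application teams can manage their own edge routing through the same Kubernetes workflow they already use.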

Integrating CI/CD into Your Workflow

It was clear from several talks, and also from the large amount of vendor booth space devoted to continuous integration and continuous delivery, that this is very much an essential methodology if you are working in the cloud native space. What still appears to be up for debate is how to integrate the associated practices into the developer workflow.

The Weaveworks team were talking a lot about the benefits of GitOps, for example, how Chick-fil-A are using this approach to manage all of their “edge” Kubernetes clusters that are deployed in each of their restaurants. The single source of truth (git) and declarative model of configuration (YAML) facilitate understanding of what is deployed where, and the security offered by the configuration “pull” approach is useful when regulations require restricted cluster access policies. We also like this concept, and have talked about it in relation to managing Ambassador using GitOps.
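As a minimal sketch of the GitOps model (the service name and image tag are hypothetical), a declarative manifest like this is committed to git, and an in-cluster operator pulls and applies it, so nothing outside the cluster needs push access:

```yaml
# deploy/orders.yaml — committed to git; a GitOps operator running in
# the cluster watches the repository and applies changes, so the git
# history becomes the audit log of what was deployed where.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders              # hypothetical service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: example.com/orders:1.4.2   # hypothetical image tag
          ports:
            - containerPort: 8080
```

Rolling back then becomes a `git revert`, and an audit of the cluster state reduces to an audit of the repository.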

Closely related to GitOps was the ChatOps approach being discussed by the Atomist team. Here configuration is also defined declaratively and stored within version control; the primary difference is that all CD-related operations are event-driven and typically managed via Slack or Microsoft Teams chat. This is closely related to the approach being developed with GitHub Actions, where workflows can be triggered by platform events (e.g. push, issue, release), which can in turn run a sequence of serial or parallel actions in response.

There was also plenty of chat around defining traditional build pipelines, but the focus was on doing so in a declarative fashion. The CloudBees team were talking extensively about the Kubernetes-native Jenkins X project, and how command-line tooling like “jx” can be combined with creating and deploying branches (and pull requests) on demand to namespaced “staging” clusters for experimentation and verification.

Debugging Microservices

Closely related to the topic of CI/CD and developer workflows was the discussion of how to debug microservices under development. Two general approaches appear to be emerging: active debugging and passive debugging.

The Datawire team have talked before about active debugging using our (CNCF-hosted) open source Telepresence tool. Telepresence allows you to “swap out” a service deployment on a remote Kubernetes cluster for a local version and proxy all associated traffic from the remote cluster through the locally running service, which can then be debugged using all of your typical local tools. You can also modify variables and code during a run, “actively” debugging as you would with any other local application. The Solo.io team are proposing a similar approach using their open source Squash tool. Phil Lombardi from the Datawire team also talked about extended use cases for Telepresence, and how you can integrate it into an effective developer workflow for Kubernetes-based services.
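A minimal sketch of the swap-out workflow with the Telepresence CLI of that era (the deployment name, port and local command are hypothetical, and this assumes kubectl is already pointed at the remote cluster):

```shell
# Replace the "my-service" Deployment in the remote cluster with a
# two-way proxy, and run the service locally under a debugger.
# Traffic sent to my-service in the cluster is routed to the local
# process, which can also resolve in-cluster service names.
telepresence --swap-deployment my-service \
             --expose 8080 \
             --run python3 -m pdb my_service.py
```

When the local process exits, Telepresence restores the original deployment in the cluster.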

Passive debugging, which is arguably similar to observability, is being proposed by the likes of the Rookout team. Rookout works by embedding a small debugging SDK within your service (or function), which allows the remote generation of stack traces, the watching of variables, and the addition of logging lines. You can’t dynamically modify any variables or code with this approach, hence the “passive” label.

Service Mesh and the Importance of the Edge

Istio was again front and center at KubeCon, and the fact that the project is sponsored by advertising-savvy Google and IBM, in combination with the ever-popular Envoy Proxy being used as the data plane, means that many engineers were talking about it. Anecdotally, some challenges still remain with Istio, such as the operational burden of running the framework (which can be somewhat offset by using a hosted offering on GKE or Aspen Mesh’s distribution). Other offerings, such as HashiCorp’s Consul Connect and Isovalent’s Cilium, are carving out a niche, currently focusing on mTLS and network security policy enforcement, respectively.

The Datawire team had many interesting discussions around the importance of managing the edge, for example using the open source Ambassador API gateway, and many customers and community members had clear business cases for implementing modern proxying and load balancing at the edge. Managing inter-service traffic was still typically important in these use cases, but getting traffic into the cluster in a manageable and dynamically modifiable way was at the top of their to-do list when migrating to platforms powered by Kubernetes.

And, On to the Next One…

The entire Datawire team had a great time at KubeCon NA, and we want to thank the CNCF and the entire organising committee (special hat tip to Liz Rice and Janet Kuo!). Thanks also to all of the speakers and the people we chatted to at the booth. The community is what makes KubeCon so special!

We’re excited to learn that the CFP is now open for KubeCon EU, which will run in Barcelona, May 20–23. If you are looking for inspiration for your talk, or just want to look over what Datawire was up to at KubeCon in Seattle, you can find our slides on the Datawire SlideShare account, and the KubeCon videos are available via the CNCF’s YouTube account. Check out the direct links to “Effective Development with Kubernetes: Techniques, Tools & Telepresence”, “Intro: Telepresence”, “Deep Dive: Telepresence”, and “The Evolution of the AppDirect Kubernetes Network Infrastructure”.

See you in Barcelona in 2019!