Monday, June 6, 2011

History of libvirt's ESX support

This is the story of how I got involved with libvirt.

I'm a computer science student at the University of Paderborn. An essential part of the master's program is a one-year project group. I participated in a project group about using virtualization technologies in a supercomputer environment. The target platform for our system was a supercomputer that runs different hypervisors, ranging from open-source hypervisors like Xen and KVM to commercial, closed-source hypervisors like VMware ESX and Microsoft Hyper-V.

One of our major tasks was to develop a system that can manage large numbers of virtual machines. We evaluated different approaches and decided to use libvirt as the hypervisor abstraction layer in our system, because it already handles several open-source hypervisors. My task in this project group was to extend libvirt with support for VMware ESX. The actual programming started in February 2009.

Because libvirt is written in C, the first problem I had to solve was how to manage a VMware ESX server from C. VMware provides several APIs to manage an ESX server, ranging from command-line tools and C libraries to full-fledged SOAP-based APIs. A good choice would probably have been the VMware VIX API, because it's implemented as a C library. But this library is closed-source, and that doesn't mix very well with an open-source project like libvirt. Therefore, I decided to develop the ESX support using the SOAP-based VMware vSphere API.

This decision led to the next problem: how to handle SOAP in C? The vSphere API's SOAP interface is object-oriented, with types that inherit from other types, and that doesn't map directly to C. VMware provides Java bindings for the vSphere API, but that doesn't help here. The typical way to get bindings for a SOAP-based API is to feed its WSDL file to a generator. There are several generator tools that can generate C bindings from a WSDL file, but each one has its own set of problems: csoap seems to be dead, Axis2C doesn't generate code for types that inherit from other types, and gSOAP doesn't handle inheritance well either. These inheritance-handling problems only occur when generating C code. For example, gSOAP generates perfectly working C++ code for the vSphere API. Another problem is the size of the generated C code: Axis2C generates over 100 MB of C code and gSOAP generates over 20 MB of C code for the whole vSphere API.
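To illustrate why inheritance is awkward in C, here is a minimal sketch of one common way to emulate it by hand: the derived struct embeds the base struct as its first member, and a runtime type tag makes checked downcasts possible. The type names (`Device`, `Disk`) and functions are hypothetical and chosen for illustration only; they are not taken from the vSphere WSDL or from the libvirt sources.

```c
#include <assert.h>
#include <stddef.h>

/* Runtime type tag: C has no built-in notion of a dynamic type. */
typedef enum { TYPE_DEVICE, TYPE_DISK } TypeTag;

/* "Base type" (hypothetical): every object starts with its tag. */
typedef struct {
    TypeTag type;
    int key;
} Device;

/* "Derived type" (hypothetical): embeds the base as its FIRST member,
 * so a pointer to a Disk can be converted to a pointer to its Device. */
typedef struct {
    Device base;
    long capacity_kb;
} Disk;

/* Initialize a Disk, including the tag stored in the embedded base. */
void disk_init(Disk *disk, int key, long capacity_kb)
{
    assert(disk != NULL);
    disk->base.type = TYPE_DISK;
    disk->base.key = key;
    disk->capacity_kb = capacity_kb;
}

/* Checked downcast: returns NULL unless the object really is a Disk. */
Disk *disk_dynamic_cast(Device *device)
{
    if (device == NULL || device->type != TYPE_DISK)
        return NULL;
    return (Disk *)device;
}
```

Every derived type needs this kind of boilerplate, which is exactly the part a generator has to get right for a large API.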

This led to the conclusion that existing code generation tools cannot be used to generate C bindings for the vSphere API. Therefore, I've written custom C bindings using libxml2 for the XML deserialization and libcurl for the HTTPS transport. This fits well with libvirt, because it already depends on libxml2 and GnuTLS for its own XML handling and remote transport encryption. libcurl can use GnuTLS to handle the SSL part of HTTPS.

The custom C bindings started as a proof of concept and were overhauled some time ago to use macros for code generation. Later on, I started to write a Python script that generates this macro-based C code from a simple object definition file. Currently, this generator handles almost all vSphere API object types and methods used for the ESX support.
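As a rough sketch of what macro-based code generation can look like in C, the example below uses a field list macro to produce both a struct definition and a matching XML serialization function. All names here (`SNAPSHOT_FIELDS`, `DEFINE_TYPE`, `Snapshot`) are hypothetical, and this is not the actual set of macros used in libvirt; it only shows the general idea that one compact description can expand into several pieces of repetitive code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical field list for one object type:
 * X(c_type, field_name, printf_format) */
#define SNAPSHOT_FIELDS(X) \
    X(int, id, "%d")       \
    X(long, size_kb, "%ld")

/* Expands one field entry into a struct member. */
#define DECLARE_FIELD(type, name, fmt) type name;

/* Expands one field entry into a bounds-checked "<name>value</name>" write. */
#define SERIALIZE_FIELD(type, name, fmt)                                  \
    if (used < size) {                                                    \
        int n = snprintf(buf + used, size - used,                         \
                         "<" #name ">" fmt "</" #name ">", obj->name);    \
        if (n < 0)                                                        \
            return -1;                                                    \
        used += (size_t)n;                                                \
    }

/* Generates the struct and its serializer from a single field list.
 * The serializer returns 0 on success and -1 on error or truncation. */
#define DEFINE_TYPE(TypeName, FIELDS)                                     \
    typedef struct { FIELDS(DECLARE_FIELD) } TypeName;                    \
    int TypeName##_serialize(const TypeName *obj, char *buf, size_t size) \
    {                                                                     \
        size_t used = 0;                                                  \
        assert(obj != NULL && buf != NULL && size > 0);                   \
        buf[0] = '\0';                                                    \
        FIELDS(SERIALIZE_FIELD)                                           \
        return used < size ? 0 : -1;                                      \
    }

/* One line per type: defines Snapshot and Snapshot_serialize(). */
DEFINE_TYPE(Snapshot, SNAPSHOT_FIELDS)
```

With this in place, adding a field means adding one line to the field list, and a script can emit such lists from an object definition file instead of someone writing them by hand.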

These custom vSphere API C bindings and the ESX support based on them have been part of libvirt since version 0.7.0 (August 2009) and have been under ongoing development ever since. Finally, we also wrote a paper, "Non-intrusive Virtualization Management using libvirt", that was published at the DATE conference in 2010.