Popcorn Linux: Software for a Diverse World

By Robert Lyerly, Christopher Jelesnianski, Anthony Carno, and Dr. Binoy Ravindran


In 1965, Gordon Moore famously predicted that the number of transistors, the tiny electrical switches that form the processing logic in computers, would double within a given area of silicon every year. Later amended to a forecast of doubling every two years, Moore’s law has continued to hold for over half a century. In fact, development roadmaps from chip manufacturing giant TSMC show transistors shrinking to five nanometers, or about 20 times smaller than a flu virus. Through most of the 1990s and early 2000s, these extra transistors were transformed into exponentially increasing computational capacity. Much of the performance increase came from processor designers’ ability to accelerate single-threaded code, the execution of a single sequence of instructions within an application.

In recent years, however, the ability to do additional useful work with those transistors has slowed because of physical limitations (such as heat dissipation and power consumption) as well as the way computers execute an application’s instructions. This plateau could not have come at a worse time, as we have plunged headfirst into the age of big data and big computing. There is an insatiable need for more powerful computers in domains such as weather forecasting, disease epidemic modeling, business analytics, machine learning, and computer vision.

To continue increasing compute power, chip designers began duplicating processing units, known as processing “cores,” within a single chip. For example, the iPhone 7 has four processing cores so that the phone can simultaneously execute foreground tasks (dialing phone numbers, opening a webpage) and background tasks (checking for new text messages, updating the Facebook news feed). Using multiple cores allows chip designers to put those extra transistors to good use, letting the phone squeeze out additional performance by executing independent work in parallel. More recently, chip designers have begun specializing processors for different tasks. Graphics processing units, for example, are designed to render graphics much faster than central processing units, but are worse at other types of tasks such as web browsing. Specialization is the key to further increasing compute power, and significant investments are being made in alternative architecture designs such as programmable network interface controllers, field-programmable gate arrays, and digital signal processors. Future systems will be composed of heterogeneous processors, much like a Swiss Army knife built from a variety of tools.

Programming these heterogeneous processors is difficult because they have physically separate components (e.g., data caches) and use different instruction-set architectures (ISAs). A processor’s ISA is the set of instructions the processor understands, and it defines the interface between the software and the hardware. One could think of a processor’s ISA as the language it uses to communicate with the outside world. This means processors from Intel (found in laptops, desktops, and servers) that “speak” the x86 language do not understand the ARM language used in smartphones and embedded devices. Programmers must build their applications using the language defined by the processor’s ISA to get the hardware to perform useful work.

To program heterogeneous processors, developers have traditionally had to break their applications up into pieces that can run on separate ISAs and manually select the best architecture for a given program piece. Imagine writing a book, except you must find and write each chapter of the book in the language that requires the smallest number of words. This is obviously a tedious and highly error-prone process, especially as developers add new features to their applications over years and even decades. In addition, the architecture choices baked into the application by the developer may be sub-optimal if the processor is running multiple applications. This could cause significant performance degradation by overloading one processor in the system while another processor sits idle.

These emerging hardware trends pose a particular challenge for the Navy’s enterprise-class software systems, which include combat system software such as the Aegis weapon control system, as they undergo hardware refreshes in their current and emerging code baselines. In particular, such legacy software systems are large (several million lines of code), have significant degrees of complexity (concurrency, distribution, fault-tolerance), and have received significant investment in their development and maintenance. Because of their size, complexity, and the investment they represent, such codebases are rarely discontinued. Instead, they are continuously enriched with new functionality, patched to add new security features, ported onto new hardware, and maintained over long life cycles. This raises the question: how can Navy developers take advantage of the benefits offered by next-generation heterogeneous processors without rewriting applications from the ground up?

Emerging applications such as data analytics, machine learning, and high-performance computing consume ever-increasing compute power. It is unclear, however, how these applications will run on new processors, which are diversifying in the face of diminishing returns in single-threaded performance.

The Popcorn Linux project (http://popcornlinux.org), spearheaded by Dr. Binoy Ravindran and the systems software research group at Virginia Tech (http://www.ssrg.ece.vt.edu), aims to answer that question using a novel software infrastructure. The project is supported in part by the Office of Naval Research and the NAVSEA/Naval Engineering Education Consortium as part of an effort to future-proof naval software systems. The team recently published a paper at the 2017 International Conference on Architectural Support for Programming Languages and Operating Systems detailing how to push per-processor hardware and ISA differences down into the infrastructure software on which applications execute, so that developers can focus solely on the application logic.

The Popcorn Linux project uses a modified version of the open-source Linux operating system, which powers the vast majority of servers in datacenters as well as the popular Android smartphone platform. The operating system is responsible for brokering access to the hardware in the system (your files, the internet, etc.) on behalf of the applications running on your computer. The operating system is also deeply intertwined with the underlying processor, as it must understand each processor’s capabilities and quirks to run the system. It is structured, however, so that applications do not have to understand these low-level hardware details. They simply use a standardized set of interfaces to talk to the operating system, which executes hardware requests on the application’s behalf.

To allow applications to take advantage of heterogeneous processors, Popcorn Linux modifies standard Linux by running a separate instance of the operating system on each available ISA in the system. Separate instances of the Linux “kernel” (the core of the operating system with which applications interact) communicate and coordinate with each other. These kernels share information about the available hardware, which applications are running, and which applications should migrate between different processors. Applications running on this modified Linux use the same standardized operating system interfaces, but Popcorn Linux adds an extra layer of functionality that allows applications to request migration to other processors. When an application requests access to a different processor, the operating system performs all the plumbing necessary to move the application and its data over to the new processor without any work by the application.
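The coordination described above can be pictured with a toy model. The sketch below is not Popcorn Linux code: the names `Kernel`, `Task`, and `request_migration` are hypothetical, and the real system does this work inside the operating system rather than in Python. It only illustrates the idea that one kernel instance per ISA tracks its own applications and hands an application, together with its data, to a peer kernel on request.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A running application and the data that travels with it."""
    name: str
    data: dict = field(default_factory=dict)


class Kernel:
    """Toy stand-in for one kernel instance, one per ISA (hypothetical)."""

    def __init__(self, isa):
        self.isa = isa
        self.tasks = {}   # task name -> Task currently running here
        self.peers = {}   # ISA name -> Kernel instance for that ISA

    def connect(self, other):
        # Kernels learn about each other so they can coordinate.
        self.peers[other.isa] = other
        other.peers[self.isa] = self

    def spawn(self, task):
        self.tasks[task.name] = task

    def request_migration(self, task_name, target_isa):
        # The application only names a destination; the kernel does the plumbing.
        if target_isa not in self.peers:
            raise ValueError(f"no kernel available for ISA {target_isa!r}")
        task = self.tasks.pop(task_name)
        self.peers[target_isa].spawn(task)


x86 = Kernel("x86-64")
arm = Kernel("aarch64")
x86.connect(arm)

x86.spawn(Task("tracker", data={"frames_seen": 12}))
x86.request_migration("tracker", "aarch64")
print(sorted(arm.tasks), arm.tasks["tracker"].data)
```

In this model the application never copies its own data; it asks for a destination and the kernels move the `Task` object between them, which mirrors the division of labor the article describes.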

Simply moving running applications between different processors, however, is not enough to transparently support heterogeneous hardware. Applications are built using a piece of software called a compiler, a tool that converts programming languages (human-readable languages in which programmers write their applications) into the 1s and 0s that computers understand. The compiler is responsible for taking the high-level instructions specified by programmers and implementing them using the instructions the processor understands, that is, the ISA. When an application is “compiled,” the compiler generates a set of instructions specific to the processor on which the application is expected to run. In Popcorn Linux, a modified compiler generates these instructions for all available ISAs and arranges them so that the operating system knows how to find the correct version based on where the application is executing. Popcorn Linux includes a customized compiler based on LLVM, an open-source compiler used by many organizations, including Apple and Google.
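As a rough illustration of "instructions for all available ISAs, arranged so the correct version can be found," consider the toy lookup below. It is a sketch under stated assumptions, not the Popcorn compiler's actual output format: the table `MULTI_ISA_BINARY`, the function names in it, and the helper `select_code` are all hypothetical, and the byte strings are placeholders rather than real machine code.

```python
# Toy model of a multi-ISA application: one entry per function,
# one compiled body per ISA. All names and contents are illustrative.
MULTI_ISA_BINARY = {
    "compute_track": {
        "x86-64": b"<x86-64 instructions for compute_track>",
        "aarch64": b"<aarch64 instructions for compute_track>",
    },
    "update_display": {
        "x86-64": b"<x86-64 instructions for update_display>",
        "aarch64": b"<aarch64 instructions for update_display>",
    },
}


def select_code(binary, function, current_isa):
    """Return the version of `function` built for the ISA we are running on."""
    versions = binary[function]
    if current_isa not in versions:
        raise LookupError(f"{function} was not compiled for {current_isa}")
    return versions[current_isa]


# The same function name resolves to different instructions on each processor.
print(select_code(MULTI_ISA_BINARY, "compute_track", "x86-64"))
print(select_code(MULTI_ISA_BINARY, "compute_track", "aarch64"))
```

The point of the arrangement is that the application refers to a function by one name everywhere, and the choice of which instructions to run is made from where the application is currently executing.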

The last piece of the puzzle relates to how applications execute on each different processor. Applications execute using a “runtime stack,” a small amount of temporary data necessary to drive the application forward. When building the application, the compiler sets up the runtime stack based on capabilities defined by the ISA. This means that an application’s runtime stack is customized for a single ISA and cannot be used as-is when running on other ISAs. To get around this issue, Popcorn Linux implements a small helper tool to convert the runtime stack between ISA-specific formats when migrating between processors. This helper tool is transparent to the application and is hooked in by the compiler at build time. When migrating, the tool attaches to the application, converts the runtime stack between ISA-specific formats, and then forwards the application to the operating system in order to migrate to a new processor.
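The conversion step can be sketched with a toy example. The code below assumes, purely for illustration, that the compiler records where each live variable of a function is kept on each ISA; the `LAYOUTS` table, the variable names, and `convert_frame` are hypothetical and greatly simplified compared with a real runtime stack, which also holds return addresses, saved registers, and frames for every function in the call chain.

```python
# Hypothetical per-ISA layouts for one function: where each live variable
# is kept at a migration point. Locations are illustrative only.
LAYOUTS = {
    "x86-64": {"count": "rbx", "total": "stack-8", "limit": "stack-16"},
    "aarch64": {"count": "x19", "total": "x20", "limit": "stack-8"},
}


def convert_frame(frame, src_isa, dst_isa, layouts=LAYOUTS):
    """Rewrite one stack frame from src_isa's layout into dst_isa's layout."""
    src, dst = layouts[src_isa], layouts[dst_isa]
    if src.keys() != dst.keys():
        raise ValueError("layouts must describe the same live variables")
    # Recover each variable's value from its source location...
    values = {var: frame[loc] for var, loc in src.items()}
    # ...and place it where the destination ISA's code expects to find it.
    return {dst[var]: val for var, val in values.items()}


x86_frame = {"rbx": 3, "stack-8": 42, "stack-16": 10}
arm_frame = convert_frame(x86_frame, "x86-64", "aarch64")
print(arm_frame)
```

The values themselves do not change during conversion; only their locations do. That is why the application can resume on the new processor as if nothing had happened.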

Using Popcorn Linux, developers do not have to think about the details of how to migrate applications between heterogeneous processors—rather they only have to think about when to migrate. Developers do not have to manually copy data or switch between ISAs to continue execution, making it dramatically easier to experiment with and leverage different processors. Each processor has a design “sweet spot” tailored to certain types of application execution. Traditional Intel x86 processors found in desktops and servers are exceptionally good at running a small number of complicated tasks extremely quickly. More recently, manufacturers such as Qualcomm and Cavium have designed high-core-count ARM processors that excel at running many simpler tasks in parallel. Oftentimes applications may contain a combination of characteristics, meaning some pieces of the application are more suitable for one processor while other pieces are more suitable for another. Popcorn Linux enables developers to easily take advantage of the available heterogeneity in the system. Further research aims to remove the need for developers to select an architecture altogether—the system would analyze how applications execute and automatically select the most appropriate processor for the program.

The Popcorn compiler ingests source code and generates an application that, together with the Popcorn runtime stack converter and Popcorn operating system, can be migrated between processors of different ISAs.

In their paper, the Popcorn Linux team showcases a heterogeneous system containing Intel’s high-performance x86 central processing units and Applied Micro’s low-power ARM central processing units. After installing Popcorn Linux on the system, they evaluate running and migrating a set of applications against a traditional single-ISA setup (i.e., one containing two x86 central processing units). The results show that by using Popcorn Linux, datacenter operators could potentially achieve a 30-percent reduction in energy consumption by using the low-power ARM processor in conjunction with the high-performance x86 processor. The results hint that different combinations of heterogeneous hardware will allow developers to hit different design points: developers can pick and choose the hardware that best suits their needs, all without having to rewrite their applications.

Popcorn Linux’s benefits also apply to legacy naval applications without requiring thousands of man-hours to rewrite millions of lines of code. Programs such as the Aegis weapon control system can be migrated onto heterogeneous-ISA hardware with minimal changes to source code. This can yield significant savings in maintenance costs, the biggest cost driver in the software life cycle. In addition, the Popcorn Linux software stack can enhance application performance, which can result in significant improvements in many Aegis-specific metrics, such as better target tracking and faster engagement times. Popcorn Linux also could be used for security purposes. Traditionally, attackers exploit application flaws to gain control inside an already-running application, and these exploits are most often specific to the ISA on which the application is executing. Switching the application between ISAs would render would-be attackers’ hand-crafted exploits useless. Using Popcorn Linux, Navy system administrators would be able to detect and react to attacks on the system.

The future of processor design is heterogeneous. Processor designers have begun creating specialized chips tailored to different types of tasks, but programming heterogeneous computer systems today is tedious and difficult for developers, especially for organizations such as the Navy that have a significant legacy code base. To enable easier application development and allow legacy applications to exploit the benefits offered by next-generation processors, the Popcorn Linux project moves ISA handling down into the software infrastructure. Applications can seamlessly take advantage of those benefits without the headaches of complex software design. This will allow the Navy to future-proof its software against upcoming hardware refreshes.

About the authors:

Robert Lyerly, Christopher Jelesnianski, and Anthony Carno are graduate research assistants with the systems software research group in the Bradley Department of Electrical and Computer Engineering at Virginia Tech. Dr. Ravindran is the principal investigator of the group.