Intel Readies Research Papers On Programmable Multicore Architectures - InformationWeek



Intel Readies Research Papers On Programmable Multicore Architectures

Three of the papers analyze three characteristic future multicore applications, including one on the concept of a "data center-on-a-chip."

Intel on Tuesday said it would release this week eight technical papers describing key findings from the company's work on future programmable multicore architectures.

The papers will be published in the Intel Technical Journal and will provide details on how the company expects future microprocessors with simplified parallel programming models to evolve. With the commodity server market moving quickly toward increasingly powerful multicore processors, new tools are needed to help programmers develop software that can take full advantage of the platforms.

What's different in developing software for multicore environments is the need for parallel programming: dividing an application's tasks among multiple processors and having them perform the work simultaneously. The complexity of such an environment requires different development tools than the ones typically used today.
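As a rough illustration of the idea (not taken from the papers), a workload can be split into chunks that worker threads process in parallel; the function names here are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Each worker computes its share of the total."""
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Divide the task into roughly equal chunks, one per worker,
    # then let the pool run the chunks simultaneously.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(list(range(1000))))  # prints 499500, same as sum(range(1000))
```

The hard part in practice is not splitting the work but coordinating shared state, which is where the tooling described in the papers comes in.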

Three of the papers analyze three characteristic future multicore applications, Sean Koehl, technology strategist for Intel, said in the company's blog. One looks at the concept of a "data center-on-a-chip." Researchers are looking at the possibility of running an e-commerce data center with 133+ processors on a single system based on a 32-core tera-scale processor. Each core would have four threads capable of taking advantage of a technique called simultaneous multithreading. SMT improves overall efficiency by permitting multiple independent threads to execute on the same core.

The paper proposes changes in the memory architecture in order to balance all the processing in such a powerful system. The changes include a model for a hierarchy of shared caches; a new, high-bandwidth L4 cache; and a cache quality-of-service scheme to optimize how multiple threads share cache space.

The other two papers demonstrate parallel scalability for two model-based applications: realism in games and movies, and home multimedia search and mining. The papers, however, also point to the need for more cache/memory bandwidth, which would be provided by a large L4 cache.

Two other related papers are more hardware focused. One covers packaging and integration of the L4 cache, and the other on-die integration of many cores. The first discusses how providing high-bandwidth memory would eventually require memory to be built right on top of the die, which is the integrated circuitry of a chip. "Our Assembly and Test Technology Development division are evaluating possible options to achieve this," Koehl said.

The second paper discusses how Intel might design and integrate caches shared between cores, and also explores the on-die interconnect mesh and other noncore components that would be integrated, such as memory controllers, input/output bridges, and graphics engines.

Another paper proposes a specific architectural change that would accelerate applications using many threads. Specifically, Intel is proposing the in-hardware implementation of a function called task scheduling, which is the mapping of work to cores for execution. The software-based methods used today introduce too much overhead for use in highly parallel workloads.
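To make that overhead concrete, here is a minimal sketch (an illustration assumed for this article, not Intel's design) of software task scheduling: worker threads pull tasks from a shared queue, and every dequeue pays a synchronization cost, which is the kind of overhead the paper proposes moving into hardware:

```python
import queue
import threading

def run_tasks(tasks, workers=4):
    """Software task scheduling: a shared queue maps work to threads.
    Every get_nowait() and lock acquisition is scheduling overhead paid
    in software; Intel's proposal would handle the mapping in hardware."""
    q = queue.Queue()
    for t in tasks:
        q.put(t)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = q.get_nowait()  # contended dequeue: overhead
            except queue.Empty:
                return
            r = task()
            with lock:                 # contended result collection
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

out = run_tasks([(lambda i=i: i * i) for i in range(8)])
print(sorted(out))  # prints [0, 1, 4, 9, 16, 25, 36, 49]
```

With very short tasks, the queue and lock traffic can dominate the useful work, which is exactly the scaling problem the paper attributes to software-based scheduling.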

Finally, the remaining two papers cover new hardware/software innovations in development at Intel to simplify parallel programming. One involves the integration of non-Intel Architecture accelerator cores, such as media accelerators. Because the accelerators have different instruction sets, they would require compilers, tools, and knowledge bases different from those developed for IA programming. The paper outlines architectural extensions, language extensions, and a runtime to extend IA for handling accelerators.

The other paper addresses the tailoring of runtimes to the special environment of tera-scale platforms. "Runtimes designed to enable efficient, low-overhead use of the many cores and threads on a tera-scale processor will be critical for software scalability," Koehl said.

The runtime presented, called McRT, provides support for fine-grained parallelism and new concurrency abstractions that ease parallel programming. "Results show how an application using this runtime stack scales almost linearly to more than 64 hardware threads," Koehl said. "McRT provides a high-performance transactional memory library to ease parallel programming by allowing the programmer to often avoid error-prone and hard-to-scale locking techniques."
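The transactional idea can be sketched in miniature: execute optimistically against a snapshot, commit only if no conflicting write happened in the meantime, and retry otherwise. The class below is a toy illustration of that concept under those assumptions; it is not McRT's actual API:

```python
import threading

class TxCell:
    """Toy optimistic-concurrency cell: read a version, compute a new
    value, and commit only if no other thread committed in between.
    Conflicts cause a retry instead of holding a lock during the work."""

    def __init__(self, value=0):
        self._lock = threading.Lock()  # guards only the brief commit point
        self._version = 0
        self._value = value

    def read(self):
        with self._lock:
            return self._version, self._value

    def commit(self, seen_version, new_value):
        with self._lock:
            if self._version != seen_version:
                return False           # conflict: someone else committed
            self._version += 1
            self._value = new_value
            return True

    def transact(self, fn):
        while True:                    # retry loop until commit succeeds
            version, value = self.read()
            if self.commit(version, fn(value)):
                return

cell = TxCell()
threads = [threading.Thread(target=cell.transact, args=(lambda x: x + 1,))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cell.read()[1])  # prints 8: no increment is lost despite the races
```

The appeal, as the quote suggests, is that the programmer writes the pure update function and the runtime handles conflict detection and retry, rather than the programmer reasoning about lock ordering.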
