Chemistry software automatically generates computer code

A new software tool promises to aid scientists whose research has forced them to lead double lives as computer programmers.

The tool, called the Tensor Contraction Engine (TCE), is said to automatically generate the computer code that chemists, physicists, and materials scientists need to model the structure and interaction of complex molecules, saving them weeks or even months of work.

‘With this tool, scientists can focus on their research rather than writing and debugging software,’ said Ponnuswamy Sadayappan, professor of computer and information science at Ohio State University.

Sadayappan leads the consortium that introduced a prototype of the TCE on September 7 at the national meeting of the American Chemical Society in New York. Partners on the project include Louisiana State University, Oak Ridge National Laboratory, Pacific Northwest National Laboratory, and the University of Waterloo.

Once the software is fully developed, it could impact two of the broadest areas of research in the physical sciences. Both computational chemistry and computational physics concern the behaviour of atoms and molecules on very large scales, and they encompass a diverse array of specialties, such as atmospheric chemistry, protein structure, materials science, and industrial chemical processing.

This research also consumes a great deal of supercomputer time around the country. In a recent study, Sadayappan and his colleagues reported that computational chemistry and materials science projects accounted for some 85 percent of computer usage at the Pacific Northwest National Laboratory, 30 percent at the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory, and 50 percent of one computer system at the San Diego Supercomputer Center.

The reason is that the interactions of atoms and molecules are so complex that scientists model them using elaborate mathematical matrices, or tensors, containing tens of millions to billions of elements. The modelling process involves dozens to hundreds of manipulations called tensor contractions, which are extremely complex and hard to program efficiently.
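To make the term concrete, here is a minimal sketch of a single tensor contraction, written in plain Python rather than in anything the TCE itself produces. The function name, the array shapes, and the sample values are all assumptions chosen for illustration. It sums the products of a three-index tensor A and a three-index tensor B over their two shared indices, leaving a two-index result.

```python
from itertools import product


def contract(a, b):
    """Compute C[i][j] = sum over k, l of A[i][k][l] * B[k][l][j].

    `a` and `b` are nested lists; the name and shapes are illustrative only.
    """
    n_i, n_k, n_l = len(a), len(a[0]), len(a[0][0])
    n_j = len(b[0][0])
    c = [[0.0] * n_j for _ in range(n_i)]
    for i, j in product(range(n_i), range(n_j)):
        total = 0.0
        # k and l are the contracted (summed) indices.
        for k, l in product(range(n_k), range(n_l)):
            total += a[i][k][l] * b[k][l][j]
        c[i][j] = total
    return c


# Tiny made-up inputs: A[i][k][l] = i + k + l, and B is all ones.
A = [[[i + k + l for l in range(2)] for k in range(2)] for i in range(2)]
B = [[[1.0 for j in range(2)] for l in range(2)] for k in range(2)]
print(contract(A, B))  # [[4.0, 4.0], [8.0, 8.0]]
```

Even this toy case needs four nested loops. With tensors of millions or billions of elements and hundreds of such contractions, the choice of loop order, memory layout, and work distribution across processors is where the programming effort goes.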

The job often falls to graduate students and post-doctoral researchers, who labour for months to write the code before scientists can begin to do any actual research.

Once fully developed, the TCE will perform the task in hours, generating an efficient parallel program that uses a minimum amount of computer memory and fast communication between parallel processors on a supercomputer, Sadayappan said.

Given a mathematical description of a problem in computational chemistry or physics, the TCE generates code in FORTRAN. Scientists then plug that code into their own software programs.
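The general idea of turning a mathematical description into FORTRAN can be sketched with a toy generator. This is not the TCE's algorithm or its input format. The function `emit_fortran_loops`, the (name, indices) notation, and the loop bounds `ni`, `nj` and so on are all hypothetical. It emits a naive, unoptimised loop nest for one contraction and assumes the result array has already been set to zero.

```python
def emit_fortran_loops(result, left, right):
    """Return FORTRAN source for result = sum(left * right) as nested loops.

    Each operand is a (name, indices) pair such as ("R", "ij").
    Hypothetical helper for illustration; not part of the TCE.
    """
    (r_name, r_idx), (l_name, l_idx), (g_name, g_idx) = result, left, right
    # Indices that appear in the inputs but not in the result are summed over.
    summed = [i for i in dict.fromkeys(l_idx + g_idx) if i not in r_idx]
    loop_order = list(r_idx) + summed

    def ref(name, idx):
        return f"{name}({','.join(idx)})"

    lines = []
    for depth, i in enumerate(loop_order):
        # Loop bounds ni, nj, ... are assumed to be declared elsewhere.
        lines.append("  " * depth + f"do {i} = 1, n{i}")
    pad = "  " * len(loop_order)
    target = ref(r_name, r_idx)
    lines.append(f"{pad}{target} = {target} + {ref(l_name, l_idx)} * {ref(g_name, g_idx)}")
    for depth in reversed(range(len(loop_order))):
        lines.append("  " * depth + "end do")
    return "\n".join(lines)


print(emit_fortran_loops(("R", "ij"), ("A", "ikl"), ("B", "klj")))
```

A real generator has far more to do than this sketch. According to Sadayappan, the TCE aims to produce parallel programs that minimise memory use and communicate quickly between processors, none of which is attempted here.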

Sadayappan got the idea for the TCE while collaborating on a particularly arduous electronic structure theory project with John Wilkins, an Ohio Eminent Scholar and professor of physics at Ohio State.

‘Some problems cropped up during the course of that work that made us realise the magnitude of the challenges involved, and by the time it was all over we had the idea for a way to make things easier,’ Sadayappan said.

At an annual workshop hosted by Russell Pitzer, professor of chemistry at Ohio State, Sadayappan discovered that some chemists had been thinking along similar lines. He joined with Pitzer and Gerald Baumgartner, an assistant professor of computer and information science with expertise in programming language design, and other chemists and computer scientists to form the consortium.

‘The success of such an endeavour requires a team with expertise in several disciplines,’ said Sadayappan, ‘and we are fortunate to have that – with world-renowned quantum chemists Nooijen and Harrison, Bernholdt’s expertise in developing software interfaces, Ramanujam’s expertise in compilers, and Hirata’s ability to bridge computer science and chemistry, as evidenced by his prototype TCE.’

Sadayappan also emphasised the work that many postdoctoral researchers and students have contributed to the project, both in developing the ideas and implementing the software, which now contains almost 50,000 lines of code.

Now is the time for potential users of the TCE to join with the consortium and help shape the system’s functionality, Sadayappan said. Interested scientists should contact him to attend future project meetings and participate in dialogues concerning new features.