High Performance Computing in the Arts and Humanities

by John Bonnett, Geoffrey Rockwell, and Kyle Kuchmey

 

High Performance Computing (HPC) is distinguished from desktop computing in two ways.
  1. The first is the speed with which it can process information. Today's high performance computers are not single machines but clusters of computers networked together with high-speed fibre optic lines. Because an HPC cluster can apply hundreds if not thousands of off-the-shelf microprocessors to a single task, it can, with a suitable application, process files much more rapidly than a desktop computer.
  2. HPC is further distinguished by the scale of the files it can process. In the sciences, HPC is used to model complex systems ranging from galaxies to meteorological systems and DNA. Desktop systems cannot run the models currently supported by HPC clusters because the files involved are too large. HPC currently supports tera-scale computing, and the first peta-scale systems are being installed. In principle, tera-scale means that HPC clusters can perform trillions (10^12) of calculations per second on trillions of bytes of data.1

Over the next few years, the next generation of HPC clusters will come online. These systems will support peta-scale computing, meaning they will perform quadrillions (10^15) of calculations per second on quadrillions of bytes of data. The National Center for Supercomputing Applications in Illinois, for example, received funding from the National Science Foundation to build the Blue Waters peta-scale computing system, which is scheduled for completion in 2010.
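
To give a rough sense of what these scales mean in practice, the following sketch estimates how long a fixed workload would take on one processor and on a cluster. The workload size, the per-processor rate and the function name are assumptions chosen for illustration, and the calculation assumes perfect linear speedup, which real clusters rarely achieve.

```python
# Illustrative arithmetic only: the workload and processor rate are assumptions.

PETA = 10**15  # one quadrillion


def ideal_runtime_seconds(total_operations, ops_per_second_per_processor, processors):
    """Runtime under perfect (linear) speedup across the given number of processors."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return total_operations / (ops_per_second_per_processor * processors)


# Hypothetical workload: 10^15 operations, each processor sustaining 10^9 per second.
workload = PETA
per_processor_rate = 10**9

desktop_seconds = ideal_runtime_seconds(workload, per_processor_rate, processors=1)
cluster_seconds = ideal_runtime_seconds(workload, per_processor_rate, processors=1000)

print(f"one processor:   {desktop_seconds / 86400:.1f} days")
print(f"1000 processors: {cluster_seconds / 60:.1f} minutes")
```

Under these assumptions the same job drops from days to minutes. In practice, communication between processors and the parts of a program that cannot be parallelized reduce the gain.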

History

High performance computers, or supercomputers, have existed since the 1960s, and three approaches have been used to design them. The first approach, implemented during the 1960s, relied on scalar processors, which process one data element at a time. During the 1970s, a second approach emerged, based on vector processors, which can process multiple data items simultaneously. Prior to the late 1980s, single machines designed by innovators such as Seymour Cray dominated the supercomputing market. The Cray-2, for example, was the world’s fastest computer from 1985 to 1989, and was used by institutions such as NASA. In the late 1980s, however, the concept of massively parallel processing emerged, based on networks of off-the-shelf hardware. This third approach met the performance benchmarks established by Cray Research and its competitors, and did so more cost-effectively. Given the lower cost, most institutions now rely on clusters of computers rather than a single supercomputer to meet their high performance computing needs.2

Why is it important?

Put simply, HPC gives researchers in all disciplines more computational power than they have ever had access to before, enabling them to undertake research in areas that were previously closed to them. Researchers in the sciences have been the first to exploit HPC. Many use it to pursue "Grand Challenge" research problems: computationally intensive projects designed to support economic development, policy formation and fundamental research. Similar research initiatives are being pursued in Europe, Japan, China, India, Canada and elsewhere. HPC simulations, for example, have been designed to assist health officials in formulating responses to crises such as an outbreak of avian flu or smallpox. HPC is also being used to support research on fusion energy, hypersonic aircraft and superconductors. In the domain of fundamental research, HPC is being used to infer the biological function of gene sequences, to predict earthquakes and to discover astronomical phenomena in telescope imagery.3

What applications have been identified for HPC in the humanities?

Until recently, most scholars in the humanities saw little need for High Performance Computing. Most were content with the research, analytical and expressive methods they had at their disposal, and did not see how the computational power or cyberinfrastructure associated with HPC could make a meaningful contribution to their research. That stance, however, is beginning to change.4 Humanities researchers have identified two potential application areas for HPC: text analysis and rich media.

Researchers interested in text analysis point out that the Internet now presents humanities researchers with datasets that are orders of magnitude greater than anything they have ever had before, numbering in the billions of pages.  High Performance Computing will enable researchers to locate and aggregate relevant data from the multiple repositories arrayed on the Internet, and to detect significant patterns contained within assembled datasets.5 
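
The aggregate-then-analyze pattern described above can be sketched at toy scale. The example below counts words in each document independently and then merges the partial counts, which is the basic shape of many large-scale text analysis jobs. The three-sentence corpus and the function names are hypothetical, and a thread pool on one machine stands in for the nodes of a cluster. It illustrates the split-and-merge structure only and is not a performance demonstration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import re


def count_words(document):
    """Count lowercase word tokens in a single document."""
    return Counter(re.findall(r"[a-z']+", document.lower()))


def aggregate_counts(documents, workers=4):
    """Count each document independently, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input documents.
        for partial in pool.map(count_words, documents):
            total.update(partial)
    return total


# Hypothetical corpus standing in for pages gathered from several repositories.
corpus = [
    "The whale surfaced. The crew watched the whale.",
    "A whale is a mammal, not a fish.",
    "The crew sang while the ship sailed.",
]

totals = aggregate_counts(corpus)
print(totals.most_common(3))
```

Because each document is counted without reference to the others, the counting step can be spread across as many processors as are available. Only the final merge needs to see all the partial results.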

Researchers interested in rich media suggest that HPC will support humanities research directed toward the development of Massive Multi-user Online Environments (MMOs) and equivalent platforms. An MMO is a spatially arranged repository of information. It supports the generation, instantiation, dissemination and documentation of content of all sorts. In many ways, its functions are equivalent to those of the book. It is distinguished, however, by the forms of representation it supports. Instead of text and number alone, it will support heterogeneous forms of representation that combine text, sound, and 2D, 3D and 4D objects. These forms will be used to represent objects such as cities, creating extremely large datasets that will require HPC clusters to support their operation. If humanities scholars mean to exploit the analytical and expressive potential of multimedia MMO environments, they will need to do two things. They will need to create expressive and attestive conventions to govern the use of multimedia objects in 4D environments. They will also need to create workflows to govern the generation, documentation and peer review of scholarly content. Both tasks will require research.6

Examples

Project: MONK
Project Description: Growing out of the NORA Project, MONK is a collaborative effort through several North American universities, focused on expanding the possibilities of text analysis through visualization of text in a 3-dimensional space.
Discipline: Text Analysis
Type of HPC: Visualization

Project: Gridcast
Project Description: The Belfast e-Science Centre (BESC) and the BBC are finding solutions to storing and organizing vast archives of video content.
Discipline: Media & Communications
Type of HPC: Mass Storage, Parallel Processing

Project: Spirited Ruins
Project Description: Boston University's HiPArt project utilizes HPC to generate this interactive 3-D space.
Discipline: Visual Art
Type of HPC: Visualization


What constraints hinder effective exploitation of HPC in the humanities?

In HPC, projects are traditionally run on a queued batch basis: you submit your project, the HPC cluster or clusters compute it, and the results are returned to you. There are humanities scholars who could operate comfortably in such a regime. Humanities researchers who wish to undertake meaningful research in MMO environments or aggregated text collections, however, will require interactive access to HPC resources on a long-term basis. To realize such a research vision, HPC networks will need to work with digital humanists to develop cyberinfrastructure that supports long-term interaction with virtual environments and digital text corpora, in addition to batch processing of research projects. SHARCNET is working with digital humanists to prototype such cyberinfrastructure.
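
The queued batch model can be illustrated with a minimal simulation. The sketch below runs jobs strictly in the order they were submitted on a single hypothetical cluster partition. The job names and runtimes are invented, and real schedulers use more elaborate policies such as priorities and backfilling.

```python
from collections import deque


def run_batch_queue(jobs):
    """Run jobs first-in, first-out and return (name, wait, finish) for each.

    `jobs` is a list of (name, runtime_hours) pairs, all submitted at hour 0.
    """
    queue = deque(jobs)
    clock = 0
    schedule = []
    while queue:
        name, runtime = queue.popleft()
        wait = clock          # hours spent waiting before the job starts
        clock += runtime      # the cluster is busy until the job finishes
        schedule.append((name, wait, clock))
    return schedule


# Hypothetical submissions: (job name, runtime in hours).
jobs = [("climate-model", 6), ("genome-assembly", 3), ("text-mining", 1)]
schedule = run_batch_queue(jobs)

for name, wait, finish in schedule:
    print(f"{name}: waited {wait} h, results returned at {finish} h")
```

In this toy schedule a one-hour text-mining job waits nine hours before it starts. That delay is acceptable for a one-off analysis but not for a researcher exploring a virtual environment or a text corpus interactively.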

What HPC infrastructure exists in Canada?

HPC research in Canada is supported by seven regional networks: ACENET,7 CLUMEQ,8 SciNet,9 HPCVL,10 RQCHP,11 SHARCNET12 and WESTGRID.13 In November 2006, the Canada Foundation for Innovation (CFI) announced its financial support for an initiative to create a national HPC platform, and a new organization to govern it: Compute Canada.14 The seven regional networks are now federated into a national platform to support HPC research in Canada.

How can I learn more about HPC in the Humanities & SHARCNET facilities?

 

________________________________

 


1 Intel.  “Tera-scale Computing Research Program.” On-line at:  http://techresearch.intel.com/articles/Tera-Scale/1421.htm [June 26, 2008].

2 “Supercomputer” in Wikipedia:  The Free Encyclopedia.  On-line at:  http://en.wikipedia.org/wiki/Supercomputer [June 29, 2008]; “Seymour Cray” in Wikipedia:  The Free Encyclopedia.  On-line at:  http://en.wikipedia.org/wiki/Seymour_Cray [June 29, 2008]. 

3 “Grand Challenge.” In Wikipedia:  The Free Encyclopedia.  Available on-line at:  http://en.wikipedia.org/wiki/Grand_Challenge [June 29, 2008]; “Grand Challenge problem.” In Wikipedia:  The Free Encyclopedia.  Available on-line at: http://en.wikipedia.org/wiki/Grand_Challenge_problem [June 29, 2008]; Chris L. Barrett, Stephen G. Eubank and James P. Smith. “If Smallpox Strikes Portland:  ‘Episims’ unleashes virtual plagues in real cities to see how social networks spread disease. That knowledge might help stop epidemics.” In Scientific American.  February 2005.  Available on-line at:  http://www.sciam.com/article.cfm?id=if-smallpox-strikes-portl [June 29, 2008]; “A short history of high-performance computing:  From supercomputers to elastic and cloud computing.” Available on-line at: www.cc.gatech.edu/classes/AY2008/cs7270_fall/7270-lect16-cloud.ppt [June 29, 2008]. 

4 At least nine contributions have emerged in recent years that explore the implications HPC presents for the future of humanities research. See “Links about Digital Humanities and HPC.” Available on-line at https://www.sharcnet.ca/dh-hpc/index.php/Links_about_Digital_Humanities_and_HPC [June 29, 2008]. 

5 Geoffrey Rockwell, Ian Lancashire, Ray Siemens. “Large-Scale Text Analysis.” In Compute Canada – A Proposal to the Canada Foundation for Innovation – National Platforms Fund.  Pp. 52-54. Available on-line at: http://www.c3.ca/ce/NPF-FINAL.pdf [June 29, 2008]. 

6 John Bonnett. “High-Performance Computing:  An Agenda for the Social Sciences and the Humanities in Canada.” At SSHRC Website:  Social Sciences and Humanities Research Council.  Available on-line at: http://www.sshrc.ca/web/about/publications/computing_final_e.pdf [January 2007] (Posted January 2007).

7 Atlantic Computational Excellence Network. Members: Dalhousie U., Memorial U., Mount Allison U., St. Francis Xavier U., St. Mary's U., U. of New Brunswick, U. of Prince Edward Island. Soon to join: Acadia U., Cape Breton U.

8 Consortium Laval UQAM McGill and Eastern Quebec for High Performance Computing. Members: McGill U., U. Laval, UQAM, and all other branches and institutes of l'Université du Québec: UQAC, UQTR, UQAR, UQO, UQAT, ETS, ENAP and INRS.

9 SciNet. Member: University of Toronto.

10 High Performance Computing Virtual Laboratory. Members: Carleton U., Loyalist College, Queen's U., Royal Military College, Ryerson U., Seneca College, U. of Ottawa.

11 Réseau québécois de calcul de haute performance. Members: Bishop's U., Concordia U., École Polytechnique, U. de Montréal, U. de Sherbrooke.

12 Shared Hierarchical Academic Research Computing Network. Members: Brock U., Fanshawe College, U. of Guelph, Lakehead U., Laurentian U., Sir Wilfrid Laurier U., McMaster U., Ontario College of Art and Design, U. Ontario Institute of Technology, Sheridan College, Trent U., U. of Waterloo, U. of Western Ontario, U. of Windsor, York U.

13 Western Canada Research Grid. Members: Athabasca U., Brandon U., Simon Fraser U., U. of Alberta, U. of British Columbia, U. of Calgary, U. of Lethbridge, U. of Manitoba, U. of Northern British Columbia, U. of Regina, U. of Saskatchewan, U. of Victoria, U. of Winnipeg.

14 Compute Canada will replace c3.ca, a national organization founded in 1997 to advocate for high performance computing in Canada. The new organization will function both as an advocacy group and as a governing body to ensure the proposed platform functions as a national infrastructure. Currently, c3.ca comprises 50 member institutions and represents the HPC interests of thousands of Canadian researchers. As conceived, Compute Canada will comprise researchers, the seven consortia, and universities.

 
© 2006 Shared Hierarchical Academic Research Computing Network