National Center for Macromolecular Imaging

CIBR Mini-workshop on Parallel Computing on Linux Clusters

June 14, 2012



Steve Ludtke
Baylor College of Medicine


Steve Ludtke
*administrative contact
Please note: This mini-workshop assumes participants have a working knowledge of command-line Unix (Linux, Mac OS X, or other) and basic programming concepts.

Linux clusters have become the de facto standard for large-scale scientific computing, yet using them is considerably different from working with desktop computers. Free cluster compute time is available to CIBR faculty, and additional free compute time can be acquired from national resources such as XSEDE (previously called TeraGrid). Making effective use of these resources requires basic training in the capabilities, limitations and tools provided by typical Linux clusters. Full training in cluster computing generally requires several days of intensive instruction. In this mini-workshop we will cover basic cluster computing concepts, including:

  1. Cluster Architecture
    • node capabilities
    • inter-node communications
    • external communications

  2. Cluster Tools/Methods
    • queuing systems
    • MPI
    • threads

  3. Rate-Limiting Factors
    • disk I/O-limited jobs
    • network I/O-limited jobs
    • memory-limited jobs

  4. Strategies for Maximizing Performance
    • effective use of scratch disks
    • local caching
    • ramdisks

At the very least, participants should leave the workshop able to assess whether their own programs are suitable for use on clusters and to determine whether a particular cluster is sufficient for a particular computation. In addition, participants should be able to run cluster-enabled software on existing Linux clusters, and should have a general idea of how to approach converting standalone programs into cluster-enabled programs.