LQCD Homepage

LQCD Home

QCDOC Computing

Lattice Archives at BNL

Contacts

User Information


Accessing QCDOC

Command Line Allocator
(replaces the Web Allocator)

Call Tracking System (CTS)
(Account is required)

User Guide

Batch System

File Transfers to/from BNL

Transferring Files between US LQCD Sites
(FNAL Link)

US LQCD Common Runtime Environment


CRE: Setup
(CRE_HOME, setup scripts, etc.)

CRE: Filesystems
(QDATA, QCACHE, QSCRATCH etc.)

CRE: Interactive System
(Compilers, libraries, development tools, etc.)

CRE: File Management
(qsplit, qunsplit etc.)

CRE Definition (pdf)
(as of June 14th, 2006)

Machine Status


Web Display (Under Construction)
(Allocation status of all available partitions)

QCDOC Status (USDOE only)
(Partitions, Jobs DB, etc.)

Batch System: Current Status
(Available Queues, Running Jobs, etc.)

Errors Database
(DB of ASIC and Wire errors.)

New Users


Computer Accounts

Accessing QCDOC

CTS accounts

CyberSecurity Training

RBRC Users Mailing List

USDOE Users Mailing List


Internal Links
(Available to QCDOC Admins Only)

QCDOC Computing at BNL

Brookhaven National Laboratory (BNL) currently hosts two large QCDOC machines: one for the RBRC community and the other for the US Lattice Gauge Theory community. In addition, four Air Cooled Crates (ACC) with single-motherboard (64-node) partitions are available for testing and debugging, and two Single Slot Back Plane (SSBP) units are used by BNL technicians.

Each QCDOC machine consists of 12,288 processing nodes (ASICs) housed in twelve water-cooled racks (1,024 nodes each), with a peak performance of 10 Tflops. The ASICs, designed by our collaboration and built by IBM, are interconnected in a six-dimensional, low-latency mesh network with the topology of a torus. Each ASIC has 4 MB of embedded DRAM and 128 MB of external DRAM and currently runs at 400 MHz. More information about this architecture can be found on the QCDOC architecture and publication web pages.
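The node count and peak figure quoted above can be cross-checked with a little arithmetic. The sketch below assumes each node completes two floating-point operations per clock cycle (one fused multiply-add); that per-cycle figure is an assumption and is not stated on this page, and the function name is purely illustrative.

```python
# Figures taken from the text above.
RACKS = 12
NODES_PER_RACK = 1024
CLOCK_HZ = 400e6

# Assumption (not stated on this page): one fused multiply-add,
# i.e. two floating-point operations, per node per clock cycle.
FLOPS_PER_CYCLE = 2


def peak_tflops(racks=RACKS, nodes_per_rack=NODES_PER_RACK,
                clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Return the aggregate peak rate of one machine in Tflops."""
    nodes = racks * nodes_per_rack
    return nodes * clock_hz * flops_per_cycle / 1e12


if __name__ == "__main__":
    print(f"{RACKS * NODES_PER_RACK} nodes, {peak_tflops():.2f} Tflops peak")
```

Under that assumption the result is just under 10 Tflops, which agrees with the rounded figure quoted for each machine.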

Front-End hosts and Remote Access

The front-end node of each QCDOC (qcdochosta for the RBRC community, qcdochostb for the US LQCD community) provides the physical connection to the machine partitions via multiple network interfaces. Users cross-compile their codes and manage the machine partitions on the front-end node.
The front-end hosts can be accessed remotely via ssh gateways (ssh.qcdoc.bnl.gov). Because the QCDOC machines reside in a network enclave, even users within BNL must go through the ssh gateways (ssh.qcdoc.bnl.local).
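As an illustration of the two-hop access path, the sketch below assembles an OpenSSH command line that reaches a front-end host through the gateway using the `-J` (ProxyJump) option. It assumes a reasonably recent OpenSSH client, that the gateway permits jump connections, and that the same user name is valid on both hosts; none of this is confirmed by this page. The user name `alice` and the community keys are hypothetical. If jump connections are not permitted, logging in to the gateway first and then running `ssh` to the front-end host by hand achieves the same thing.

```python
import shlex

# Host names taken from the text above.
GATEWAY = "ssh.qcdoc.bnl.gov"
FRONT_ENDS = {
    "rbrc": "qcdochosta",    # RBRC community
    "uslqcd": "qcdochostb",  # US LQCD community
}


def ssh_command(user, community, gateway=GATEWAY):
    """Build an ssh argument list that hops through the gateway.

    Assumes the same (hypothetical) user name on both hosts.
    """
    host = FRONT_ENDS[community]
    return ["ssh", "-J", f"{user}@{gateway}", f"{user}@{host}"]


if __name__ == "__main__":
    # "alice" is a placeholder user name.
    print(shlex.join(ssh_command("alice", "uslqcd")))
```

Users inside BNL would pass `gateway="ssh.qcdoc.bnl.local"` instead of relying on the default.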

Available File Systems

The "host" filesystem is globally shared by all processing nodes. It is usually provided by the front-end host (500 GB), but it may also be provided by NAS Linux servers.

The parallel file system (pfs) provides high I/O throughput from the processing nodes. It is served by NAS Linux servers: one per machine rack for the RBRC machine, two per rack for the US LQCD machine. It plays a role similar to that of scratch disks on cluster processing nodes. Each QCDOC node writes to its own directory on the pfs systems. All pfs systems are NFS-mounted on the corresponding front-end host.
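The pfs server counts implied by the description above can be worked out as follows. This is a rough sketch: it assumes twelve racks of 1,024 nodes per machine (as described earlier on this page) and that the nodes of a rack are split evenly across that rack's pfs servers, which this page does not state. The function and dictionary names are illustrative only.

```python
# Machine geometry as described on this page.
RACKS = 12
NODES_PER_RACK = 1024

# pfs servers per rack, from the text above.
PFS_SERVERS_PER_RACK = {"RBRC": 1, "US LQCD": 2}


def pfs_layout(machine):
    """Return (total pfs servers, nodes per server) for a machine.

    Assumption: nodes in a rack are divided evenly among its servers.
    """
    per_rack = PFS_SERVERS_PER_RACK[machine]
    servers = RACKS * per_rack
    nodes_per_server = NODES_PER_RACK // per_rack
    return servers, nodes_per_server


if __name__ == "__main__":
    for name in PFS_SERVERS_PER_RACK:
        servers, nodes = pfs_layout(name)
        print(f"{name}: {servers} pfs servers, {nodes} nodes per server")
```

Under the even-split assumption, doubling the servers per rack halves the number of nodes writing to each server.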
