Common Runtime Environment: Filesystems
Definitions/Mappings
Definitions/filesystem mappings on QCDOC:
- $HOME=/home/$USER
- $QCACHE_USER=/cache/users/$USER
- $QCACHE_PROJECT=/cache/projects/<project name>
- $QDATA=/host/$QMACHINE/$USER
- $QSCRATCH= Not defined
- Interactive node (IN)= qcdochostb.qcdoc.bnl.gov
- Batch System Execution Node (BSEN)= qcdochostb.qcdoc.bnl.gov
- Compute Node (CN)= QCDOC Compute Node
- Master I/O Node (MIN)= Node 0 of a partition. All QCDOC CN are I/O capable.
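
The mappings above can be written down as a small Python sketch, for example to sanity-check an environment from a script on the Interactive Node. This is only an illustration: the function names `cre_paths` and `check_env` are hypothetical and are not part of the CRE, and the user and partition names in the usage comment are made up.

```python
import os


def cre_paths(user, qmachine, project="qcdoc"):
    """Return the QCDOC filesystem locations listed above as a dict.

    `user`, `qmachine` and `project` are supplied by the caller; nothing
    here queries the real system.
    """
    return {
        "HOME": f"/home/{user}",
        "QCACHE_USER": f"/cache/users/{user}",
        "QCACHE_PROJECT": f"/cache/projects/{project}",
        "QDATA": f"/host/{qmachine}/{user}",
        "QSCRATCH": None,  # not defined on QCDOC
    }


def check_env(expected, environ=os.environ):
    """Return the names whose value in `environ` differs from `expected`.

    A variable expected to be undefined (None) matches only if it is unset.
    """
    return sorted(
        name for name, path in expected.items() if environ.get(name) != path
    )


# Hypothetical usage: report variables that do not match the documented layout.
# mismatches = check_env(cre_paths("alice", "mypartition"))
```

An empty list from `check_env` means the environment agrees with the documented layout for that user and partition.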
$HOME
- Every user has a unique home directory (/home/UserName) that is automatically
created when the user account is set up.
- Home directories are available only on the Interactive/Batch System Execution Node.
On QCDOC, home directories are not mounted on Compute Nodes.
- Home directories are not meant to store large data files. They provide
the disk space to edit, compile and store code/scripts.
- They are backed up (daily incremental, monthly full).
- On QCDOC, the total disk space available to home directories is currently 1.6TB.
- The env. variable $HOME is available in both interactive (command-line) sessions and batch scripts.
$QDATA
$QDATA provides a filesystem for staging data for a parallel job.
- QDATA (aka the host filesystem on QCDOC) provides a staging area that is shared between
the Interactive Node (IN) and all the QCDOC nodes.
Although the CRE only requires that QDATA be mounted on at least a
nominated Master I/O Node (MIN), on QCDOC QDATA is NFS mounted on all
compute nodes.
- QDATA does not provide disk space for long-term storage. It is only meant to
be used for IO for the duration of the job. Users should move their data to a more permanent
location (see QCACHE) when the job has finished.
- The env. variable $QDATA points to a location that depends on the machine partition.
Different machine partitions use different staging filesystems.
For a given machine partition, QDATA is /host/$QMACHINE/$USER, where $QMACHINE is the
machine partition, and $USER is the username.
- In batch scripts $QDATA is set by the PBS init scripts.
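
As a rough illustration of the "move your data when the job has finished" rule, the sketch below moves one job's output directory from the staging area to QCACHE (described in the next section). It is a minimal example under stated assumptions: the function name `archive_job_output` and the directory name `run1` are hypothetical, and the code assumes it runs on the Interactive Node, where both filesystems are mounted.

```python
import os
import shutil


def archive_job_output(qdata, qcache_user, job_dir):
    """Move `job_dir` from the staging area `qdata` into `qcache_user`.

    Refuses to overwrite an existing destination so that earlier results
    are never silently replaced. Returns the destination path.
    """
    src = os.path.join(qdata, job_dir)
    dst = os.path.join(qcache_user, job_dir)
    if not os.path.isdir(src):
        raise FileNotFoundError(f"no job output at {src}")
    if os.path.exists(dst):
        raise FileExistsError(f"refusing to overwrite {dst}")
    os.makedirs(qcache_user, exist_ok=True)
    # shutil.move copies and then deletes when src and dst are on
    # different filesystems, which is the case for QDATA and QCACHE.
    shutil.move(src, dst)
    return dst


# Hypothetical usage after a job has finished:
# archive_job_output(os.environ["QDATA"], os.environ["QCACHE_USER"], "run1")
```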
QCACHE ($QCACHE_USER, $QCACHE_PROJECT)
QCACHE provides a filesystem for storing computational results over a long time.
- QCACHE is a large filesystem (currently 3.3TB, easily expandable) for long-term storage of
computational results; it may also act as a front-end for tertiary storage (archive systems, tapes, etc.).
- It is NFS mounted on the front-end node, thus it is available to interactive and batch jobs.
It is not available on Compute Nodes.
- QCACHE is not being backed up. At BNL, it is mirrored ("rsync-ed") once a day to a filesystem of the same size.
- It is accessed through two separate env. variables: $QCACHE_USER (pointing to user specific storage)
and $QCACHE_PROJECT (pointing to project specific storage).
- Env. Variable $QCACHE_USER is set to /cache/users/$USER once the appropriate setup
script is sourced ($CRE_HOME/bin/setup.(c)sh).
- Currently, all QCDOC users belong to the qcdoc group. Thus, the env. variable $QCACHE_PROJECT
is set to /cache/projects/qcdoc for all users once the appropriate setup
script is sourced ($CRE_HOME/bin/setup.(c)sh). We plan to refine this in the future.
- The setgid bit (s) is set on the directory $QCACHE_PROJECT. Files created under $QCACHE_PROJECT
take the group id of the directory. The same is true for new subdirectories, which also have the setgid bit set.
- Env. variables $QCACHE_USER and $QCACHE_PROJECT are available in batch scripts.
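
The setgid behaviour described above can be checked from a script. The following is a small sketch using Python's standard `os` and `stat` modules; the helper names `mode_has_setgid` and `inherits_group` are hypothetical and chosen only for this example.

```python
import os
import stat


def mode_has_setgid(mode):
    """True if the setgid bit is present in a numeric file mode."""
    return bool(mode & stat.S_ISGID)


def inherits_group(path):
    """True if `path` has the setgid bit set.

    For a directory this means that new files and subdirectories created
    inside it take the directory's group id.
    """
    return mode_has_setgid(os.stat(path).st_mode)


# Hypothetical usage on the front-end node:
# inherits_group(os.environ["QCACHE_PROJECT"])
```

A mode such as 2775 (octal) has the bit set, while 0775 does not.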
$QSCRATCH
The env. variable $QSCRATCH is not defined for QCDOC.
Each Compute Node (CN) is expected to be able to write to its own private high performance
filesystem (scratch area) indicated by the env. variable $QSCRATCH.
In the case of QCDOC, all nodes in a crate (512 nodes) share the same
filesystem, known as the Parallel File System (PFS).
Each node writes to its own directory specified by the
node's physical coordinates; for example: /R16/C0/B1/M2/D22/A0.
From the point of view of a parallel program the path to the $QSCRATCH area is the same
for all compute nodes. In QCDOC this path is /pfs. Although the path
is the same for every compute node, each node is actually writing to a separate directory.
- PFS systems are provided by Linux file servers, one for each crate (a total of 24 PFS
systems for the entire USQCD QCDOC machine).
- All 24 PFS systems are NFS mounted on the Interactive Node (IN) qcdochostb under
/pfs/rRcC, where R is the Rack (16-27) and C is the
Crate (0 or 1); for example: /pfs/r22c0/
- Total disk space available per crate (512 nodes) is 1.4TB.
- The PFS systems provide a transient area for storing temporary files;
users should remove their data files from the PFS systems as soon as a job finishes.
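
The /pfs/rRcC naming scheme on the Interactive Node can be spelled out with a short sketch, which may be handy when looping over crates to look for leftover files. The helper names `pfs_mount` and `all_pfs_mounts` are hypothetical; only the rack range (16-27), the crate numbers (0 or 1) and the path pattern come from the description above.

```python
RACKS = range(16, 28)  # racks 16 through 27
CRATES = (0, 1)        # two crates per rack


def pfs_mount(rack, crate):
    """Return the mount point of one crate's PFS on the Interactive Node."""
    if rack not in RACKS or crate not in CRATES:
        raise ValueError(f"no such crate: rack {rack}, crate {crate}")
    return f"/pfs/r{rack}c{crate}"


def all_pfs_mounts():
    """Return the mount points of all PFS systems, ordered by rack and crate."""
    return [pfs_mount(rack, crate) for rack in RACKS for crate in CRATES]
```

With 12 racks of 2 crates each, `all_pfs_mounts()` yields the 24 mount points mentioned above, and `pfs_mount(22, 0)` gives the example path /pfs/r22c0.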