Allocations

Quick overview of steps needed to start computing on Summit

  • In order to run jobs on Summit, you need to be a member of an "allocation" of compute time. Allocations are used to facilitate the reasonable and fair sharing of our limited resources.
  • If you are a member of a CU Boulder research group that already has a project and allocation on Summit (including condo allocations), ask your PI to email us with permission for you to access the appropriate project.
  • Otherwise, request access to the UCB General allocation by sending an email to rc-help@colorado.edu with a few sentences outlining your proposed use of Summit.

The general allocation is available for initial startup and testing as well as small-scale production work. Within one year, ensure that your research group has created its own Summit project and has requested an allocation of CPU time associated with that project. Only PIs can create projects.

Please read through the following sections for more information about projects and allocations.

Projects, General access, and Allocations

In this context a Project is a defined research effort and a container for compute time Allocations, as well as listings of expected goals and reported results. Projects require PI approval and supervision. For our purposes a PI is a long-term research or teaching faculty member, not someone in a shorter-term student, PostDoc, or visitor position.

Projects are required for use of RC resources as a means to document the PI supervision of computational work, the reporting of results, and most of all to facilitate the reasonable and fair sharing of our limited resources.

CU Boulder faculty/staff/student users will either be part of a PI supervised Project or they can initially get access to Summit via “General” access for up to one year. The “General” compute time allocation is provided so that UCB users may familiarize themselves with the system and do the preliminary work needed to inform a Project and Project Allocation.

Outside collaborators, visitors, etc. will need to be part of a Project and its related allocation in order to use Summit. This ensures that the outside collaborator is supervised by a UCB PI/Faculty member and is working on UCB research.

Projects may span multiple years; however, compute time allocations associated with Projects will need to be re-evaluated yearly.

Users who are part of a Project can obtain compute time Allocations to increase priority and allow for greater throughput. There is no charge for an Allocation; however, there is a process to request, review, and award these Allocations. Approximately 50 million SU are available to be allocated to about 50 UCB research groups via Project allocations each year.

Allocation requests will require prior use of Summit for testing, scaling and optimization work to inform the Allocation request. Unused hours from Janus will not carry over onto Summit.

We strongly urge you to email rc-help@colorado.edu with questions about the request process or to begin a collaboration on your Allocation request with RC. Doing so can help avoid wasted effort and delays in allocation approval.

General Access without a Project

  • (Startup/Temporary compute time allocation)
  • Requested via a portal (in development). In the meantime, email rc-help@colorado.edu to request access to the general allocation.
  • Easy to get for new UCB Summit users
  • Limited to a year
  • Impractical for larger scale operation
  • Results reporting required
  • Not available to Sponsored Affiliates, Visitors, or external collaborators
  • Users can use about 10K SU per month with reasonable queue waits

Project

  • Approximately 5 year maximum duration
  • Reasonably specific, not a general request for group resources for a variety of goals
  • Container for compute time Allocations
  • Administered via a web portal
  • Supervised by a PI (“long-term” faculty)
  • Establishes clarity of longer term research effort
  • Organize expected and reported results (papers, degree awards, etc).
  • Requires annual confirmation of Project continuation and user list.
  • Required for Sponsored Affiliates, Visitors, and external collaborators

Project Compute Allocation

  • Establishes an amount of available compute time to project members, requested as Service Units or “SU” (based on core-hours) but awarded as a % share of the system.
  • Rewards users who work within their budget with increased priority on Summit.
  • Typically allows for more throughput and shorter wait times than General access
  • Jobs continue to run at a lower priority even if the % share is overused
  • General-like testing allocations are possible when conditions warrant, for example for visitors or for users who have already had a year of unsupervised General access but need to test for a new Project
  • 50 million SU available per year to allocate to about 50 UCB research groups

Project Description - Include this information when creating a Project

  • Title (Unique, descriptive, about 60 characters or less)
  • PI (Must have a Research Computing account)
  • Concise description of the research project and its significance
  • Estimated Project timeframe/duration
  • Concise description of the related computational work
  • Goals (typically degrees, publications, presentations, data products)
  • Grants that support this project, specifically grants that are directly related to work performed on RC resources
  • General description of the data storage, sharing and management for the project (and beyond if applicable).
  • Create a new project using the web portal

Allocation Request Worksheet

This worksheet can help you develop your Allocation request proposal. You can use it as a template or to check your proposal to determine if it is complete. Please address all of the bullet points in the worksheet. If you feel that some of the requested information is not applicable, please note why that is rather than leaving that section out of your request entirely.

Requests for larger allocations will need more detailed justification. As a general guideline, a request for 300K SU might require a two-page proposal, while a 1M+ SU request might require three to five pages.

Why do we ask for so much detail? It's not to create unnecessary, irritating work for you! It helps to ensure that CU's heavily subscribed HPC resources are being used properly and fairly. The allocation request process gives you the opportunity to ensure that you have a clear plan for using Summit, that your workflow is appropriate for the resources you plan to use, and that your application is running efficiently. Recently, a number of groups have realized improvements of 4x or greater in overall efficiency by working through the steps in this allocation worksheet. That immediately leads to greater research productivity!

Introduction and summary

  • Concise Description - Describe the portion of the Project that this computational work supports
  • Allocation goals - Describe the anticipated goals for this particular effort as a subset of the Project goals.
  • Duration - Indicate whether this allocation is for one year or until completion of a nearer-term goal, whichever is sooner.
  • Expected followup - indicate if this is the final allocation for the Project or if work will likely continue.

Computational method

  • Describe the application(s) that you will be running
  • Details of application optimization
    • If the application is an RC-provided module, please say so; in that case you don't need to provide further details about application optimization.
    • If the application was optimized and detailed in a previous Allocation Request, please indicate the specifics or paste in those details with attribution.
    • Describe how the computational algorithm was optimized; note whether optimized numerical libraries such as Intel MKL are used; note any compiler optimization flags used.
    • When was optimization and scaling testing performed, and by whom? Please include information such as Job IDs and/or username and dates.
  • Workflow optimization
    • For parallel applications, show how the total job time changes as more cores or nodes are used (i.e., provide scaling information)
    • Describe how nodes are fully utilized in terms of memory, CPU or both.
    • Describe how the workflow was structured to fit the resources, walltime limits, etc.
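
The scaling information requested above can be summarized as speedup and parallel efficiency relative to your smallest test run. The following is a minimal sketch in Python, not an RC-provided tool: the function name `scaling_table` is hypothetical, and the walltimes in the example are made-up placeholders that you would replace with measurements from your own test jobs.

```python
def scaling_table(timings):
    """Summarize strong scaling from measured walltimes.

    timings: dict mapping core count -> walltime (any consistent unit).
    Returns a list of (cores, walltime, speedup, efficiency) tuples, where
    speedup and efficiency are relative to the smallest core count tested.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts; efficiency is the
        # fraction of that ideal actually achieved.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, timings[cores], speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder walltimes in seconds (NOT real measurements).
    example_timings = {24: 1000.0, 48: 520.0, 96: 290.0}
    print(f"{'Cores':>6} {'Time':>8} {'Speedup':>8} {'Eff.':>6}")
    for cores, time, speedup, eff in scaling_table(example_timings):
        print(f"{cores:>6} {time:>8.1f} {speedup:>8.2f} {eff:>6.2f}")
```

An efficiency that drops well below 1.0 at higher core counts suggests that the larger job size costs more SU than it saves in walltime, which is worth noting in your request.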

I/O

  • Describe the disk I/O by job type
  • Temporary files - indicate the size and number of job specific temporary files and how/if they are removed.
  • Local vs scratch - Usage of on-node local disk vs /scratch
  • Output files - Describe the nature, size and number of output files that remain after job completion.

Data Management

  • Describe how much data this effort will produce, per job and overall.
  • Indicate how much of that is temporary "raw" output that will be post-processed and then deleted.
  • Indicate how much will need to be migrated off scratch to safe storage. (Recall that scratch filesystems are purged at regular intervals and thus can only be used for temporary storage.)
  • Describe this "safe" storage (RC PetaLibrary, department file server, etc.)

CPU time ("SU") request

  • Break down the estimated number of jobs of each type and the cores and walltime they require.
  • Indicate the Billing Weight modifier for the resource type. The different Summit nodes (Haswell, KnL, GPU, and high memory) are charged or “billed” against your award differently based on actual node purchase cost.
  • Multiply and total to determine your SU request
  • See the Summit Partitions in our User Guide for more details on the various Summit nodes

Example table for calculating total SU requirement

Node Type    Jobs  Cores  Hours  Weight  Total (SU)
shas          100     24     12     1        28,800
sgpu                                2.5
smem           50     12      4     6        14,400
sknl                                0.1
Grand Total                                  43,200
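
The arithmetic behind the table (jobs × cores × hours × billing weight, summed over job types) can be sketched as a short Python script. This is an illustrative sketch only: the class and function names are hypothetical, and the job counts and weights are simply copied from the example rows above. Check the User Guide for the current billing weights before using it for a real request.

```python
from dataclasses import dataclass


@dataclass
class JobType:
    """One row of the SU worksheet (names here are illustrative)."""
    partition: str
    jobs: int
    cores: int
    hours: float
    weight: float  # Billing Weight modifier for the node type

    def service_units(self) -> float:
        # SU = jobs x cores x hours x billing weight
        return self.jobs * self.cores * self.hours * self.weight


def total_su(job_types):
    """Sum the SU across all job types in the request."""
    return sum(j.service_units() for j in job_types)


# Rows copied from the example table; replace with your own estimates.
request = [
    JobType("shas", jobs=100, cores=24, hours=12, weight=1.0),
    JobType("smem", jobs=50, cores=12, hours=4, weight=6.0),
]

if __name__ == "__main__":
    for j in request:
        print(f"{j.partition:<12}{j.service_units():>10,.0f}")
    print(f"{'Grand Total':<12}{total_su(request):>10,.0f}")
```

With the example rows this reproduces the 28,800 and 14,400 SU subtotals and the 43,200 SU grand total shown in the table.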

Project allocation requests should be submitted through the relevant project in the web portal.