HPC Policies and Procedures
UHD's Linux-based High-Performance Computing (HPC) environment has been implemented in the UHD data center to provide a collection of Linux-based nodes, common system, data management and networking services, and the capacity and performance objectives that researchers need to perform the complex computations associated with their work.
The Business Owner of the HPC environment is Dr. Benjamin Soibam, an Assistant Professor at UHD's College of Sciences and Technology, who partnered with a team from UHD Information Technology and built the environment with the assistance of:
- Silicon Mechanics (implementation design)
- Bright Computing (provided software)
The HPC environment has been designed to use hardware and open-source software components that are commonly used in higher education research environments and is configured following a "Platform as a Service" ("PaaS") model. This model allows the HPC Business Owner to provision multiple "platforms", each a dedicated subset of the HPC environment's computing, storage, and networking resources, that can be allocated to a requesting research team to address its high-performance computing needs.
Generally, each platform is provisioned with a standard set of hardware and software components. However, the platforms will vary in capacity and throughput, depending on the requesting research team's needs. Additionally, each research team is expected to extend its standard platform with the specialized software and data needed to complete its project.
Purpose of the Computing Environment and Platform
To provide a high-performance computing environment to be shared by UHD researchers.
Governance
UHD's research computing will operate under the oversight of a steering committee that includes:
- Provost
- CFO
- Associate Vice President, Research and Sponsored Programs
- Executive Director for Academic and Student Affairs
- CIO
- UHD ISO
The group will meet monthly or bi-monthly to review existing and upcoming projects presented by the CIO, establish or confirm priorities, and discuss operations and future directions. IT will also update the university's shared governance Academic Technology Committee with HPC activities and project updates and provide the committee's feedback and recommendations to the steering committee for their review.
Requests for research computing resources will be submitted through HPC's website and will be reviewed and completed by the coordinating faculty in accordance with the governance document. All externally funded projects utilizing HPC must complete the operational and security assessment review by UHD IT and UHS ISO and be approved by the ORSP.
HPC operation will be managed by IT in coordination with the Office of Research & Sponsored Programs, the coordinating PIs, and the ISO.
Roles and Responsibilities
- Dr. Benjamin Soibam:
- Receives requests for account creation.
- Approves and creates accounts.
- JR Sears:
- User Support and Training
- Vince Esquivel, Ammad Khan, Franklin Phan, Chris Stewart, Dominic Brasted, Javier Diaz:
- System Administrators
- Handle specific customization requests for research platforms
- Backup and Hardware Updates
- Make sure the HPC is functional
- Work closely with Dr. Soibam on any updates
Account Requests
HPC is available for UHD faculty, staff, and students. In order to use the HPC system, you must request access.
- Eligibility
- Anyone who is affiliated with the University of Houston-Downtown and has a valid University of Houston-Downtown ID can request an account.
- Requests for accounts are submitted through the request form and are routed to Dr. Soibam.
- Requests for HPC accounts are screened and approved by Dr. Soibam.
- If approved, Dr. Soibam creates the accounts following UHD guidelines.
- New HPC users are urged to meet with a member of the HPC User Services to discuss their research and obtain assistance with HPC resources.
- Users are provided with a basic tutorial on how to connect to the HPC and how to run jobs on the HPC environment.
- Users will be trained in coding best practices.
- Account Provisioning
- When your account is provisioned, you will be provided with the following services:
- Access to the HPC computational clusters.
- An HPC home directory.
- Password expiration (complexity follows UHD guidelines).
- Users are instructed to change their passwords the first time they log in following UHD guidelines.
- Users' accounts are valid for the duration of the project.
- Account Deprovisioning
- The account review process will be performed at the end of each semester for accounts that are no longer active.
- Users are notified of the pending deletion via email. Every attempt is made to notify users through all known email addresses. Users have 30 days' advance notice to retrieve any files they wish to preserve. The user must reply to the HPC group's request to remove their data.
- After 30 days have elapsed, the HPC group removes the user's account and the associated home directory. Files owned by the user are deleted, and the data is lost even if the same user is re-provisioned in the future.
- Usage Policy
- HPC clusters require the use of Secure Shell (SSH).
- Do not leave the terminal unattended.
- You may not share your account with anyone under any circumstance.
- Protect your username and password and follow UHD campus password policies.
- If you suspect a security problem, report it promptly to security@uhd.edu.
- Account Security
- The HPC group reserves the right to lock any account at any time.
- Account sharing is strictly prohibited by university policy.
- The account review process will be performed at the end of each semester for accounts that are no longer active.
- Users are notified of the pending deletion via email. Every attempt is made to notify users through all known email addresses. Users have 30 days' advance notice to retrieve any files they wish to preserve.
- The HPC group reserves the right to revoke access to and retain any data or files that may be used as evidence of a violation of any HPC terms of use, security policy, or university regulation.
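The 30-day notice window described under Account Deprovisioning can be illustrated with a short date calculation. This is only a sketch under stated assumptions: the function names and the sample notification date are hypothetical, and the policy does not say how the HPC group actually tracks these dates.

```python
from datetime import date, timedelta

# From the policy: users receive 30 days' advance notice before deletion.
NOTICE_PERIOD_DAYS = 30


def deletion_date(notice_sent: date) -> date:
    """Earliest date on which the account and home directory may be removed."""
    return notice_sent + timedelta(days=NOTICE_PERIOD_DAYS)


def may_delete(notice_sent: date, today: date) -> bool:
    """True once the full 30-day notice period has elapsed."""
    return today >= deletion_date(notice_sent)


if __name__ == "__main__":
    sent = date(2024, 5, 15)  # hypothetical notification date
    print("Notice sent:     ", sent)
    print("Earliest removal:", deletion_date(sent))
    print("Removable on 2024-06-01?", may_delete(sent, date(2024, 6, 1)))
```

A check like this would typically run after the end-of-semester account review, once the notification emails have gone out.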
Account Creation
Accounts are created through the HPC Cluster Manager interface. The HPC group provisions each user account on the cluster using the BrightView interface.
Support Services
- Authentication: Users log in via SSH and are authenticated by the local LDAP service.
- Change Management: UHD has created a formal process for making planned and unplanned changes to the University of Houston-Downtown production IT environment. The Change Management Process is intended to cover all of UHD's computing systems and platforms. Any change to the way a system provides service requires this process, for example implementing anything new that has a major effect on the network, systems, or users, or performing scheduled or emergency maintenance. This ensures that the day-to-day IT functions performed to provide effective change management satisfy corporate governance audit requirements and ultimately reduce risk. In addition to meeting all the audit requirements, the system provides a process for efficient and prompt handling of all IT changes completed by the IT Department. The system includes the following steps, which begin only after the application owner has been contacted and has approved the changes:
- Submission: During this step, a change is identified and a change request is submitted. The change is evaluated, including determining the priority level of the service and the risk of the proposed change; determining the change type and the change process to use.
- Planning: Once the request is received, the change is planned, including the implementation design, scheduling, communication plan, test plan, and roll-back plan.
- Approval: Approval for the Change Plan is obtained from management as needed. Three groups of people are involved: 1) approvers, 2) those receiving FYI notifications, and 3) those filling out the online form (the person performing the maintenance work). Each director provides three lists naming who from their IT area will fill each of these groups.
- Implementation: The change is implemented after approval.
- Request Review: The Change Plan is communicated to and reviewed with peers and/or the Change Advisory Board regarding its success or failure. If the change resulted in a failure in service, a mitigation strategy is defined for future occurrences. All aspects of the change are documented.
- Backup: Working in concert with the HPC cluster software vendor, Bright Computing, and the hardware vendor, Silicon Mechanics, it was determined that only the storage node of the HPC cluster needs to be protected. The storage node is backed up daily to the enterprise backup system in the One Main Building (OMB) data center and, upon backup completion, replicated to the Shea Building DR data center as well as to the Microsoft Central US Azure Data Center. Protected data is held on each cluster for 30 days and also archived to Microsoft's Central US Azure Data Center, where it is retained for one year. In the event recovery is needed, restoration would occur from the OMB data center cluster. Should that system be unavailable, restoration would occur from the Shea DR data center. In the event both on-premises clusters are unavailable, restoration would occur from the cluster in Microsoft Azure. Should data older than 30 days be needed, restoration would occur from the archive in the Microsoft Central US Azure Data Center.
- Restoration Services: Requests are initiated by Dr. Soibam using the IT ticketing system and are reviewed by IT leadership for approval. Once a request is approved, the data or system state is restored.
- Log Review: Logs showing unauthorized access to certain commands are reviewed by the HPC team using the log analyzer software and provided to the research team to review.
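The restoration order described under Backup can be summarized as a small decision function. This is an illustrative sketch only: the site labels, the function name, the treatment of "one year" as 365 days, and the handling of boundary days are assumptions made for the example, not part of the enterprise backup system's configuration.

```python
from datetime import date, timedelta
from typing import Optional, Set

LOCAL_RETENTION = timedelta(days=30)     # data held on each backup cluster
ARCHIVE_RETENTION = timedelta(days=365)  # Azure archive; "one year" assumed to be 365 days

# Fallback order stated in the policy for data inside the 30-day window.
RESTORE_ORDER = ["OMB", "Shea DR", "Azure"]


def restore_source(backup_date: date, today: date, available: Set[str]) -> Optional[str]:
    """Pick where a restore would come from, or None if no copy is expected to exist."""
    age = today - backup_date
    if age <= LOCAL_RETENTION:
        # Try the clusters in the documented order, skipping unavailable sites.
        for site in RESTORE_ORDER:
            if site in available:
                return site
        return None
    if age <= ARCHIVE_RETENTION:
        # Older than 30 days: only the Azure archive still holds the data.
        return "Azure archive"
    return None  # past the one-year archive retention


if __name__ == "__main__":
    today = date(2024, 6, 30)  # hypothetical dates for illustration
    print(restore_source(date(2024, 6, 20), today, {"Shea DR", "Azure"}))
    print(restore_source(date(2024, 3, 1), today, {"OMB", "Shea DR", "Azure"}))
```

The sketch keeps the 30-day clusters and the one-year archive as separate branches because the policy describes them as separate recovery paths.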
HPC Encryption Policy/Requirements
- This is an HPC cluster.
- The head node provisions all of the compute nodes over network boot.
- Compute node file systems are created remotely by the head node.
- Software images are copied from the compute node to the head node.
- The head node software is provided by Bright Computing in the form of an ISO file.
- The software is installed on bare metal directly from the ISO file.
- The system runs RHEL 7.x.
- The storage node is encrypted.
- The package manager is configured to access packages from Red Hat and from Bright Computing.
Customization Request
Specific customization requests will be forwarded to Dr. Soibam. Authorization, implementation, and removal will be carried out jointly by Dr. Soibam and the system administrators after confirming that the changes do not adversely affect the HPC environment.
System Scans
The ISO will regularly scan the HPC environment for vulnerabilities and malware.