RSC Group

RSC Group is the leading Russian developer of liquid-cooled supercomputer solutions. The Bitworks team helps RSC create innovative management systems for supercomputing environments that simplify deployment, diagnostics and cluster monitoring and bring them to a whole new level of availability.

In May 2015, RSC began looking for a contractor to develop a next-generation supercomputer cluster management system. RSC chose our team largely because of our expertise, which covers both the development of complex distributed systems and the development, deployment, configuration and maintenance of heterogeneous software and hardware infrastructures (software-defined data warehouses and networks, the deployment and configuration of OpenStack and CloudStack virtualization clusters, etc.). We were able to quickly grasp the customer's high-level needs and propose comprehensive solutions that met them.

 

As part of the project, our team is implementing a software system that allows supercomputer center engineers to perform the following tasks:

  • primary and secondary (complete or partial) one-click deployment of a fully managed supercomputing environment (CPU, Xeon Phi and GPGPU nodes);
  • efficient monitoring of clusters arranged in different topologies to discover faults, find anomalies, allocate resources and request servicing of faulty components;
  • cluster management through a user interface that works over the BMC and NMC interfaces and software-defined agent interfaces;
  • aggregation of cluster resource operation and load metrics (syslog, BMC, NMC and other agents), as well as their classification and visualization for monitoring and reporting purposes;
  • cluster resource accounting (CPU hours, energy consumption, storage resources, VDI environment resources) for monitoring, accounting and billing of supercomputer center subscribers;
  • system-wide centralized logging of all cluster components in a single store, with event search and visualization capabilities for auditing purposes;
  • an HPC application management environment integrated with the Slurm workload manager;
  • a VDI management environment integrated with OpenStack;
  • cluster resource billing.
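The accounting and billing tasks above can be illustrated with a minimal Python sketch. It is not taken from the RSC system: the UsageRecord fields, the summarize and bill functions, and the rates are all hypothetical, and it assumes that each finished job reports its core count, wall-clock time and average power draw.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One finished job. All field names are hypothetical."""
    subscriber: str
    cores: int
    wall_seconds: float
    avg_power_watts: float


def summarize(records):
    """Aggregate CPU hours and energy (kWh) per subscriber."""
    totals = defaultdict(lambda: {"cpu_hours": 0.0, "energy_kwh": 0.0})
    for rec in records:
        hours = rec.wall_seconds / 3600.0
        totals[rec.subscriber]["cpu_hours"] += rec.cores * hours
        # watts * hours gives watt-hours; divide by 1000 for kWh
        totals[rec.subscriber]["energy_kwh"] += rec.avg_power_watts * hours / 1000.0
    return dict(totals)


def bill(totals, rate_per_cpu_hour, rate_per_kwh):
    """Turn aggregated usage into a charge per subscriber (rates are assumed)."""
    return {
        subscriber: round(
            usage["cpu_hours"] * rate_per_cpu_hour
            + usage["energy_kwh"] * rate_per_kwh,
            2,
        )
        for subscriber, usage in totals.items()
    }


records = [
    UsageRecord("lab-a", cores=16, wall_seconds=7200, avg_power_watts=400.0),
    UsageRecord("lab-a", cores=8, wall_seconds=1800, avg_power_watts=200.0),
    UsageRecord("lab-b", cores=32, wall_seconds=3600, avg_power_watts=800.0),
]
totals = summarize(records)
charges = bill(totals, rate_per_cpu_hour=0.5, rate_per_kwh=0.1)
```

In a real deployment the records would more likely come from the scheduler's accounting data and the BMC power sensors than from hand-written objects; the sketch only shows how the two metrics combine into a single charge.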

RSC independently designs and develops its supercomputers, from electronics (excluding system and network chips and CPUs) to hardware (cooler PCBs, racks, server racks, PSUs, liquid cooling stations, etc.), which allows it to achieve unprecedented energy efficiency and component density. For example, a standard RSC rack has a power consumption of up to 500 kW, which requires novel management mechanisms that are not found in "normal" air-cooled supercomputers and classic data centers. A fully manageable, intelligent RSC cluster includes numerous supplementary microcontrollers that interoperate according to IoT principles; this approach is the only way to achieve optimum operation of the entire computational infrastructure and maximum ROI on the equipment.

 

The software that we develop for RSC is one of a kind, both in Russia and globally. Potential users show great interest in demos of the software, which are currently showcased at various professional events.

Company website: http://www.rscgroup.ru/

 

The following technologies are used in the project:

  • Frameworks
    • ZeroMQ
    • Apache Mesos
    • Angular 2
    • Tornado
  • Data management
    • Apache ZooKeeper
    • Redis
    • Elasticsearch
    • PostgreSQL
  • Languages
    • JavaScript
    • Python
  • Integration
    • Jenkins
    • Docker
    • Puppet
  • General
    • Test-driven development
    • Continuous integration
    • Git-flow