Software Management

Scientific software has adopted agile development practices and its development cycles have been impressingly shortened. New features and utilities are continuously implemented and it results in new increments of the software included in quick releases.

The main goal of accelerating development cycles is to provide new solutions and quick fixes to the end-user as soon as possible. From the point of view of the e-Infrastructure and the usage of containers, this means that we need to provide an alternative service where to build and store containerized software and have it available and ready-to-run transparently for end-users.

Several components are part of the Software Management component:

  • Marketplace

  • Gitlab

  • Containers

  • Sregistry

From the point of view of end-users only the Marketplace is relevant. The marketplace is where they can browse and purchase software for solving their experiments.

From the point of view of developers, Gitlab, Singularity and Sregistry satisfy all the agile and automated development workflow, from the code to the execution of experiments, to allow users to take advantage of brand new software versions running with the best performance. The Marketplace is the place where developers provide their applications to end users.

1. Marketplace: Provide or purchase your software

The Marketplace manages the “products” lifecycle. It is an archival infrastructure to store, provide and categorize MADFs, pilots, benchmarks, models, etc. That is, registration of new apps, offerings, purchases… It is based on the Fiware Business Ecosystem.

All users can access to the MarketPlace. It shows the list of offered applications and a left menu to manage the product inventory and stock. Using the marketplace, the end-user can buy the different apps that are available. After adding one to the cart, the user can go to the shopping cart and checkout the application, paying for it using paypal if it is not free.

Developers take the role of software suppliers in the MarketPlace. In addition to what an end-user can do using the MarketPlace, developers can create new products and offer them to be discoverable and purchased by end-users. Products can have a given price, to be paid through paypal, or be free. Once a product is purchased it will be usable from the Experiments tool.

2. Gitlab: Repository and CI/CD/CD

Providing a software repository accessible from a single place (the portal) helps to homogenize applications usage, integration with the portal, and to increase the visibility and the impact of the provided data and applications. The repository based on Gitlab is deployed as a standalone service in the Cloud (gitlab.srv.cesga.es).

The repository provides a bunch of technologies, tools and features in order to ease management, development, documentation, integration and delivery of MADFs and Pilots. From the point of view of the e-Infrastructure, the most relevant features provided are the version control tracking system (based on Git), community-oriented communication tools (wikis, issue trackers, code snippets, etc.), a Docker container registry and a scalable continuous integration service.

The core components of the repository are the version control system and the tools around it. With them, developers can create new repositories or import/mirror already existing repositories and track and manage the evolution of the development of their software projects. They can use their own computers to work before submitting changes or directly use their browsers. It provides graphical tools for code revision, changes acceptance/rejection, etc. Development teams can perfectly integrate their coding workflows with it as GitLab empower developers with confidentiality and permissions management. They can create developer teams and manage visibility, roles and permissions per repository and team.

One of the biggest strengths of the software repository implementation is the continuous integration feature. The CI service automates software projects building and testing. It helps to check the correctness of new developments for their approval and ease early error detection and quick bug fixing.

As scientific software building and testing can be costly in terms of time and resources usage, the CI service scalability is of utmost importance. In addition, each software project can need a completely different environment where to be built. This is why continuous integration has been deployed taking advantage of CESGA Cloud system (OpenNebula) and Docker-machine. This means that every continuous integration process occurs inside a Docker container and in a separated virtual machine. This configuration results in a highly scalable and flexible solution.

For the particular case of repositories of web content, CI service also provides Continuous Deployment of web sites by means of Static Site Generators as Jekyll. This enables developers to also publish up-to-date content, like documentation, according to the latest status of their software.

3. Containers: Singularity

Traditional software deployment and integration consists in performing all the needed steps, like download, configure, build, test and install, to have the software project natively in a production infrastructure. The main goal of this task is to provide the software with good performance and ready to use for end users.

Scientific software and, in particular, mathematical frameworks are extremely complex from an architectural point of view. They are usually composed of numerous mathematical concepts and features implemented along several software components in order to provide high level abstraction layers. These layers can be implemented in the software project itself or integrated via third party libraries or software components. The whole environment of each mathematical framework is usually composed of a complex dependency matrix and, at least, one programming language and compiler.

Container technology has become very popular as it makes application deployment very easy and efficient. As people move from virtualization to container technology, many enterprises have adopted software container for cloud application deployment. There is a stiff competition to push different technologies in the market. Although Docker is the most popular, to choose the right technology, depending on the purpose, it is important to understand what each of them stands for and does. MSO4SC has chosen Singularity as container technology to provide portable and high performance containers.

Singularity was designed focusing on HPC and allows to leverage the resources of whatever host in which the software is running. This includes HPC interconnects, resource managers, file systems, GPUs and/or accelerators, etc.

Singularity was also designed around the notion of extreme mobility of computing and reproducible science. Singularity is also used to perform HPC in the cloud on AWS, Google Cloud, Azure and other cloud providers. This makes it possible to develop a research work-flow on a laboratory or a laboratory server, then bundle it to run on a departmental cluster, on a leadership class supercomputer, or in the cloud.

4. SRegistry: Container storage and transfer

Having a Cloud service for storing and transferring Singularity images let us to provide a central location where to offer ready-to-run containerized software in a homogeneous way. It also enables developers to implement portable workflows for their containerized applications instead of manually copying the images to every HPC system and hardcoding paths in their workflows definitions.

We have deployed a Cloud service based on SRegistry with this purpose (sregistry.srv.cesga.es). SRegistry is a service to simply store and transfer Singularity images. It allows managing the published images visibility, but it’s configured to keep images private by default unless the user make it explicitly public. Currently this service is under heavy development and adding new features very quickly. The current status only enables completely public workflows, but we expect to enable private images workflow and have all features we need ready and in production very soon.

Integrating this service together with the continuous integration service let to provide a continuous delivery service. This means that after building the container during the CI this container will be send and persistently stored and having it available for all users.