
Managing large and complex datasets is a key process in many areas of scientific, academic, cultural, social, and economic activity. On the technical side, this requires a set of core data management services built on a scalable, high-capacity, efficient, and ubiquitous infrastructure. Such an infrastructure should enable the migration or replication of data of cross-disciplinary or lasting importance, as well as the hosting of newly acquired datasets for scientific or economic use.
The infrastructure should also keep this data secure and provide efficient access through an extensible set of access services and data presentation mechanisms that support various access protocols and data formats. Furthermore, it should integrate functionality and resources for data processing, including High-Performance Computing (HPC), Big Data analytics, and machine learning and artificial intelligence (ML/AI), without the need to move data to a different infrastructure.
The disk storage infrastructure at PCSS consists of several types of data storage and sharing systems: mid-range disk arrays, high-performance disk arrays for HPC systems, specialized storage and sharing systems (including high-performance file servers and SSD arrays), and open, scalable disk server clusters running Software-Defined Storage (SDS) software. These systems serve a range of needs: storing and sharing data for the HPC cluster (temporary data space), maintaining cloud platforms (volumes for virtual machines and containers), implementing cloud storage applications and services (data synchronization and sharing), securely storing backups and long-term archival data in a distributed architecture, and providing general-purpose storage services for applications and platforms through block, file, and object interfaces.
PCSS data storage systems also handle (store, serve, and stream) extensive digital content, large files, and objects. This includes multimedia (high-resolution audio and video), digital objects stored and shared in repositories, and data acquired in research projects in disciplines that manage extensive datasets from high-quality instruments (e.g., data from radio telescopes, high-resolution scans of animal species).
PCSS offers services for creating, storing, and restoring backups and archival data (Backup/Archive) based on tape systems with disk buffers hosted on disk arrays. Currently, PCSS primarily uses IBM Jaguar tapes and drives, managed by Tivoli Storage Manager with HSM (Hierarchical Storage Management) functionality. The hierarchical storage structure provided to users consists of many tiers, ranging from ultrafast SSD arrays and SSD/NVMe storage, through storage for parallel computing workloads, to tape storage, with a total capacity of 80 PB.
PCSS resources are made available for the following tasks:
- Increasing the computing power of OPI PIB along with improving user access to ICT systems managed for the Ministry of Science and Higher Education (MNiSW).
- Computing grants for the scientific community in Poland under the MNiSW SPUB (Special Research Device Program) subsidy.
The data infrastructure was partially financed by MNiSW as part of investments related to education and scientific activity, under the expansion of the KDM infrastructure for the Poznan scientific community (subsidies no. 7414/II/SP/2023, 7512/II/SP/2024, P/II/SP/0353/2025/06).