Data Curation Preservation Issues (Facilities, Digital Repository Systems and High-Performance Computing)

 

The rapid growth of digital research data has transformed the way knowledge is created, shared and preserved. Across disciplines, researchers generate vast quantities of data through experiments, simulations, observations and computational analyses. As a result, data curation and preservation have become critical components of the research lifecycle. The key factors influencing successful data curation are the availability of appropriate facilities, robust digital repository systems and advanced computational infrastructure such as high-performance computing (HPC). However, each of these components presents unique challenges that institutions must address.

One of the fundamental requirements for effective data preservation is the availability of adequate facilities and infrastructure. Research institutions need reliable storage environments, backup systems, secure networks and disaster recovery mechanisms to protect digital assets (Masenya & Ngulube, 2019; Shah et al., 2021). However, maintaining such facilities can be costly and technically demanding. According to Zareef and Jabeen (2025), many institutions struggle to balance increasing data volumes with the infrastructure needed to manage them effectively. Without secure storage facilities and redundancy mechanisms, digital collections are at risk of corruption, accidental deletion or catastrophic loss (Shah et al., 2021).
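One common way to detect the corruption and accidental deletion described above is a periodic fixity check, in which stored files are compared against previously recorded checksums. The following is a minimal sketch, not a description of any particular institution's workflow. The function names (`sha256_of`, `build_manifest`, `audit`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current contents of root with a stored manifest."""
    current = build_manifest(root)
    return {
        # Files recorded earlier that can no longer be found.
        "missing": sorted(set(manifest) - set(current)),
        # Files whose content no longer matches the recorded checksum.
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
        # Files that appeared after the manifest was written.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

In practice the manifest would be stored separately from the collection it describes, ideally alongside a redundant copy, so that a single storage failure cannot damage both the data and the record used to verify it.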

Closely linked to physical and digital infrastructure are digital repository systems, which serve as platforms for storing, managing and disseminating research data. Repositories play a crucial role in ensuring that datasets remain discoverable, accessible and reusable over time (Shah et al., 2021). Effective repositories support metadata creation, version control, access management and preservation workflows. However, repositories often lack clear governance structures, resulting in inconsistent metadata practices and poor preservation planning (Rothfritz et al., 2026). This reduces the discoverability of data and hampers the long-term usability of stored content. In addition, keeping repository platforms such as DSpace or EPrints up to date is a recurring problem, and outdated installations are exposed to security and usability issues.
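Inconsistent metadata of the kind described above can be made visible with a simple completeness check run over deposited records. The sketch below is only illustrative: the list of required fields is an assumed local policy, not a standard taken from the sources cited here, and the record layout (a plain dictionary per dataset) is hypothetical.

```python
# Assumed minimum fields for a deposit; a real policy would be set by
# the repository's governance body.
REQUIRED_FIELDS = ("title", "creator", "date", "identifier", "license")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in a record."""
    gaps = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            gaps.append(field)
    return gaps


def audit_records(records: list[dict]) -> dict[str, list[str]]:
    """Report incomplete records, keyed by identifier or list position."""
    report = {}
    for index, record in enumerate(records):
        gaps = missing_fields(record)
        if gaps:
            key = record.get("identifier") or f"record-{index}"
            report[key] = gaps
    return report
```

A report like this does not fix governance problems on its own, but it gives repository managers a concrete list of records to follow up on.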

The increasing use of high-performance computing (HPC) has introduced both opportunities and challenges for data curation. HPC systems enable researchers to process and analyze massive datasets generated through scientific simulations, artificial intelligence applications and data-intensive research (Almeida & Okon, 2025). While these systems significantly enhance research capacity, they also produce unprecedented volumes of data that require effective management and preservation strategies. Several technical and organizational obstacles continue to affect the integration of HPC and digital preservation systems. These include insufficient storage capacity, difficulties in transferring large datasets, lack of standardized metadata practices and inadequate coordination among researchers, information professionals and information technology specialists (Almeida & Okon, 2025; Yoon et al., 2025). As Arms (2008) observed, modern science has entered a data-intensive era in which data volumes often exceed traditional storage and management capabilities. As a result, institutions must develop integrated approaches that connect HPC environments with data repositories and preservation infrastructures.

To address these issues, institutions are increasingly adopting best practices such as implementing trusted digital repositories, utilizing cloud-based storage solutions, applying the FAIR (Findable, Accessible, Interoperable and Reusable) data principles (Wilkinson et al., 2016) and investing in scalable cyberinfrastructure. Advances in cloud computing, automated metadata generation and distributed storage technologies are also helping institutions improve preservation efficiency and resilience.

In conclusion, facilities, digital repository systems and high-performance computing are essential components of modern data curation and preservation. While they provide the infrastructure needed to manage and safeguard valuable research data, they also introduce significant technical, financial and organizational challenges. Future success in digital preservation will depend on sustained investment in infrastructure, stronger collaboration among stakeholders and continued adoption of innovative technologies that support long-term data stewardship.

https://www.youtube.com/watch?v=64-mBFdWTtM

References

Almeida, F., & Okon, E. (2025). Assessing the impact of high‑performance computing on digital transformation: benefits, challenges, and size‑dependent differences. The Journal of Supercomputing. https://doi.org/10.1007/s11227-025-07281-z

Arms, W. Y. (2008). Cyberscholarship: High performance computing meets digital libraries. Journal of Electronic Publishing, 11(1). https://doi.org/10.3998/3336451.0011.103

Masenya, T. M., & Ngulube, P. (2019). Digital preservation practices in academic libraries in South Africa in the wake of the digital revolution. South African Journal of Information Management, 21(1), 1–9. https://doi.org/10.4102/sajim.v21i1.1011

Rothfritz, L., Matthias, L., Pampel, H., & Wrzesinski, M. (2026). Current challenges and future directions for institutional repositories: A systematic literature review. An Annual Review of Information Science and Technology (ARIST) paper. Journal of the Association for Information Science and Technology.

Shah, U. A., Hussain, M., Saddiqa, M., & Yar, M. S. (2021). Problems and Challenges in the Preservation of Digital Contents: An Analytical Study. Library Philosophy and Practice. https://digitalcommons.unl.edu/libphilprac/5628

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, Article 160018. https://doi.org/10.1038/sdata.2016.18

Yoon, A., Kim, J., & Donaldson, D. R. (2025). Big data curation framework: Curation actions and challenges. Journal of Information Science, 51(1), 205–223. https://doi.org/10.1177/01655515221133528

Zareef, M., & Jabeen, M. (2025). Systematic literature review of digital curation services in academic libraries (2001–2023): A global perspective. Journal of Information Science, 1–29. https://doi.org/10.1177/01655515241305348

 

 
