De-duplication on the Backup System with Information Storage in a Database

Preventing data loss on digital media involves making backups. A backup can be made manually by copying data to external media, or automatically on a schedule using dedicated software. In remote backup systems, data are saved over the network to a remote repository. Such sys...

Bibliographic Details
Main Author: Sergey M. Taranin
Format: Article
Language: English
Published: Yaroslavl State University 2017-04-01
Series: Моделирование и анализ информационных систем
Subjects: file; data; backup; de-duplication; database
Online Access: https://www.mais-journal.ru/jour/article/view/510
collection DOAJ
description Preventing data loss on digital media involves making backups. A backup can be made manually by copying data to external media, or automatically on a schedule using dedicated software. In remote backup systems, data are saved over the network to a remote repository. Such systems are multi-user and process large amounts of data, so files in the shared storage may contain identical fragments. De-duplication is the mechanism that eliminates this repeated data: it is a method of information compression in which copies are searched for across the entire dataset rather than within a single file. The main advantage of the technology is a significant saving of disk space. However, eliminating repeated data can significantly reduce the speed of saving and restoring information. This article addresses the problem of implementing such a mechanism in a backup system that stores information in a relational database. It considers an example implementation of such a system working in two modes: with de-duplication of data and without it. The article presents a class diagram for the client part of the application and describes the tables of the backend database and the relationships between them. The author proposes an algorithm for saving data with de-duplication and reports the results of comparative tests of the speed of the saving and restoring algorithms with relational database management systems from different vendors.
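
The idea in the description above, searching for copies across the whole dataset and keeping the data in relational tables, can be illustrated with a minimal sketch. This is not the author's algorithm or schema, which this record does not reproduce. It assumes fixed-size blocks, SHA-256 digests as block keys, and SQLite as the database; the names `chunks`, `file_chunks`, `open_store`, `save_file`, `restore_file`, and `stored_chunks` are hypothetical.

```python
import hashlib
import sqlite3

CHUNK_SIZE = 4096  # assumed fixed block size; not taken from the article


def open_store(path=":memory:"):
    """Create the two hypothetical tables: unique blocks and per-file block lists."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS chunks (
            digest TEXT PRIMARY KEY,
            data   BLOB NOT NULL
        );
        CREATE TABLE IF NOT EXISTS file_chunks (
            file_name TEXT    NOT NULL,
            seq       INTEGER NOT NULL,
            digest    TEXT    NOT NULL REFERENCES chunks(digest),
            PRIMARY KEY (file_name, seq)
        );
    """)
    return db


def save_file(db, name, content, dedup=True):
    """Split content into blocks and store them, sharing identical blocks if dedup is on."""
    with db:  # one transaction per saved file
        db.execute("DELETE FROM file_chunks WHERE file_name = ?", (name,))
        for seq, start in enumerate(range(0, len(content), CHUNK_SIZE)):
            block = content[start:start + CHUNK_SIZE]
            key = hashlib.sha256(block).hexdigest()
            if not dedup:
                # Without de-duplication every block gets its own key,
                # so identical blocks are stored once per occurrence.
                key = f"{name}:{seq}:{key}"
            # With de-duplication an already known digest is skipped here.
            db.execute(
                "INSERT OR IGNORE INTO chunks (digest, data) VALUES (?, ?)",
                (key, block),
            )
            db.execute(
                "INSERT INTO file_chunks (file_name, seq, digest) VALUES (?, ?, ?)",
                (name, seq, key),
            )


def restore_file(db, name):
    """Reassemble a file by joining its block list with the block table."""
    rows = db.execute(
        "SELECT c.data FROM file_chunks f JOIN chunks c ON c.digest = f.digest "
        "WHERE f.file_name = ? ORDER BY f.seq",
        (name,),
    )
    return b"".join(row[0] for row in rows)


def stored_chunks(db):
    """Number of blocks physically stored."""
    return db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
```

In this sketch, the space saving comes from the primary key on `digest`, and the speed cost mentioned in the description shows up as one extra hash computation and key lookup per block on save and a join on restore. Blocks no longer referenced by any file are not garbage-collected here.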
id doaj-art-37f5d65ea10b40418a831d46c2a7113f
institution Matheson Library
issn 1818-1015
2313-5417
spelling doaj-art-37f5d65ea10b40418a831d46c2a7113f 2025-08-04T14:06:37Z
Sergey M. Taranin, P.G. Demidov Yaroslavl State University
Моделирование и анализ информационных систем, 2017-04-01, vol. 24, no. 2, pp. 215-226
doi 10.18255/1818-1015-2017-2-215-226
topic file
data
backup
de-duplication
database