Data Replication in DBMS

Last Updated : 19 Jul, 2025

Data replication is the process of storing copies of the same data at more than one site or node. In practice, it means copying data from a database on one server to another server so that users at different sites can share the same data, provided the copies are kept in sync. Its main benefit is improved availability: if one copy is unreachable, another can serve the request. The result is a distributed database in which users can access the data relevant to their tasks without interfering with the work of others.

Types of Data Replication

  • Transactional Replication : Transactional replication involves sending an initial full copy of the database to subscribers, followed by real time updates as changes occur at the publisher. Changes are replicated in the exact order, ensuring transactional consistency. It’s ideal for server to server environments requiring high accuracy and reliability.
  • Snapshot Replication : Snapshot replication copies and distributes data exactly as it exists at a specific point in time, without tracking ongoing changes. It sends the entire dataset to subscribers, making it suitable for infrequent data changes or initial synchronization. Though slower than transactional replication, it's simple and effective for periodic updates.
  • Merge Replication : Data from two or more databases is combined into a single database. Merge replication is the most complex type of replication because it allows both publisher and subscriber to independently make changes to the database. Merge replication is typically used in server to client environments. It allows changes to be sent from one publisher to multiple subscribers.
  • Master slave replication : In a master slave architecture, one database server is designated as the master and one or more other servers are designated as slaves. The master receives all write operations, and the slaves receive a copy of the data from the master.
  • Multi master replication : In this type of replication, all the servers involved in replication can receive write operations and all the updates made to any server will be replicated to all the other servers.
  • Peer to peer replication : In this type of replication, each server can act as both a master and a slave and the data is replicated between all the servers in a peer to peer fashion.
  • Single source replication : In this type of replication, a single source database is replicated to multiple target databases.
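
The master slave and transactional ideas above can be illustrated with a minimal sketch. It is not the implementation of any particular DBMS: the class names `Primary` and `Replica`, the in-memory dictionaries and the explicit `replicate()` call are all assumptions made for illustration. The point it shows is that writes go to one server, are recorded in an ordered change log, and are applied to the copies in that same order.

```python
class Replica:
    """A read-only copy that applies changes from the primary in log order."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self, log):
        # Apply only the entries this replica has not seen yet, in order.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)


class Primary:
    """Accepts all writes and records them in an append-only change log."""

    def __init__(self, replicas):
        self.data = {}
        self.log = []
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def replicate(self):
        # Hypothetical explicit sync step; a real system would do this
        # continuously or on a schedule.
        for replica in self.replicas:
            replica.catch_up(self.log)


replicas = [Replica(), Replica()]
primary = Primary(replicas)
primary.write("balance", 100)
primary.write("balance", 80)
primary.replicate()
print(replicas[0].data)
```

Because each replica applies the log entries in the order the primary recorded them, every copy ends with the same final value for `balance`.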

Replication Schemes

1. Full Replication

The most extreme case is replication of the whole database at every site in the distributed system. This will improve the availability of the system because the system can continue to operate as long as at least one site is up.

Data Replication
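
The availability claim can be sketched in a few lines, assuming a hypothetical `Site` class and two helper functions, `write_all` and `read_any`, that are not part of any real library. Every write is sent to every site, and a read succeeds as long as at least one site is up.

```python
class Site:
    """One site holding a complete copy of the database (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


def write_all(sites, key, value):
    """Apply a write to every live site; return how many copies were updated."""
    updated = 0
    for site in sites:
        if site.up:
            site.data[key] = value
            updated += 1
    return updated


def read_any(sites, key):
    """Read from the first live site; fail only if every site is down."""
    for site in sites:
        if site.up:
            return site.data[key]
    raise RuntimeError("no site available")


sites = [Site("A"), Site("B"), Site("C")]
copies = write_all(sites, "customer:1", "Alice")

# Two of the three sites fail; the remaining one can still answer.
sites[0].up = False
sites[1].up = False
print(copies, read_any(sites, "customer:1"))
```

The same sketch hints at the cost: a single logical write touched three copies. It also simplifies failure handling, since a site that is down during a write would need to be resynchronized before it could serve reads again.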

Advantages of full replication

  • High Availability of Data : Since every site holds a complete copy of the database, users can access data even if one or more sites fail.
  • Faster Query Execution : With data stored locally at every site, queries can be processed without waiting for responses from other locations. 
  • Improved Performance for Global Queries : In distributed systems, global queries that would normally fetch data from multiple sites can now be answered locally.

Disadvantages of full replication

  • Difficult to Achieve Concurrency : Managing concurrent access and updates across all replicas is complex and can lead to conflicts or inconsistencies.
  • Slow Update Process : A single update must be propagated to every replica across multiple databases, making the update process time consuming and resource intensive.
  • High Storage Costs : Replicating data at every site leads to duplication, significantly increasing storage requirements.
  • Complex Consistency Maintenance : Ensuring that all copies remain consistent demands advanced synchronization mechanisms, increasing system complexity.
  • Increased Update Overhead : Frequent updates across all nodes can lead to high communication and processing overhead, especially in large scale systems.
  • Expensive in Terms of Resources : Full replication consumes more bandwidth, computing power and administrative effort due to frequent synchronization and update propagation.
  • Risk of Redundant Data Movement : In case of improperly managed replication strategies, unnecessary data transfers may occur, impacting performance.
  • Performance Bottlenecks during Updates : Simultaneous updates across sites can temporarily slow down the system, affecting overall query performance.

2. No replication

No replication means that each piece of data is stored at only one location (or node) in the entire distributed database system. There are no duplicate copies of the data on other servers or sites.

Advantages of No replication

  • Reduced Concurrency Issues : Concurrency control is simpler because each data item is updated at only one site.
  • Simplified Recovery : Each data item is stored at a single site, so recovery involves only that site.

Disadvantages of No replication

  • Poor Data Availability : Each data item is held by only one server, so it becomes unavailable if that server fails or cannot be reached.
  • Slower Query Execution : Queries slow down because every client that needs a data item must access the same server.

3. Partial replication

Partial replication means that some fragments are replicated whereas others are not, so each site holds only a subset of the database. This reduces storage costs compared with full replication but requires careful planning to ensure data consistency.
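
A small sketch can make the idea concrete. The fragment names, site names and the `placement` map below are made up for illustration. The map records which sites hold a copy of each fragment, and `remote_fragments` reports which fragments a query issued at a given site would have to fetch from elsewhere.

```python
# Hypothetical placement: important fragments get more replicas.
placement = {
    "customers": {"site_a", "site_b", "site_c"},  # replicated everywhere
    "orders": {"site_a", "site_b"},               # replicated at two sites
    "audit_log": {"site_c"},                      # not replicated
}


def remote_fragments(fragments, site):
    """Return the fragments a query at `site` would have to fetch remotely."""
    return sorted(f for f in fragments if site not in placement[f])


def total_copies():
    """Count stored fragment copies across all sites."""
    return sum(len(sites) for sites in placement.values())


needed = remote_fragments(["customers", "orders", "audit_log"], "site_a")
print(needed, total_copies())
```

With three fragments and three sites, this placement stores 6 fragment copies, compared with 9 under full replication and 3 under no replication.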

Advantages of partial replication

  • Data-Centric Replication : The number of replicas created for a fragment depends directly on the importance of the data in that fragment.
  • Optimized System Architecture : A well-chosen design combines advantages of both the full replication and no replication schemes.

Disadvantages of partial replication

  • Complex Management and Configuration : Partial replication requires careful planning to decide what data to replicate, where to replicate it and how often.
  • Non-uniform Data Access : Since not all data is available at every site, global queries may need to fetch data from multiple locations.

Features of data replication are:

  • Increased Availability: Data replication can improve availability by providing multiple copies of the same data in different locations, which reduces the risk of data unavailability due to network or hardware failures.
  • Improved Performance: Replicated data can be accessed more quickly since it is available in multiple locations, which can help to reduce network latency and improve query performance.
  • Enhanced Scalability: Replication can improve scalability by distributing data across multiple nodes, which allows for increased processing power and improved performance.
  • Improved Fault Tolerance: By storing data redundantly in multiple locations, replication can improve fault tolerance by ensuring that data remains available even if a node or network fails.
  • Improved Data Locality: Replication can improve data locality by storing data close to the applications or users that need it, which can help to reduce network traffic and improve performance.
  • Simplified Backup and Recovery: Replication can simplify backup and recovery processes by providing multiple copies of the same data in different locations, which reduces the risk of data loss due to hardware or software failures.
  • Enhanced Disaster Recovery: Replication can improve disaster recovery capabilities by providing redundant copies of data in different geographic locations, which reduces the risk of data loss due to natural disasters or other events.

Advantages of data replication are:

  • Improved performance : Data can be read from a local copy instead of a remote one.
  • Increased data availability : Copies of the data can be used if the primary database fails.
  • Improved scalability : Reading from the replicas reduces the load on the primary database.

Disadvantages of data replication are:

  • Increased complexity : The replication process needs to be configured and maintained.
  • Increased risk of data inconsistencies : The same data can be updated simultaneously on different replicas.
  • Increased storage and network usage : Multiple copies of the data need to be stored and transmitted.

Data replication is used in various types of systems, such as online transaction processing systems, data warehousing systems and distributed systems.
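
The inconsistency risk listed above arises when two replicas accept different updates to the same item before they synchronize. The sketch below shows one possible way to reconcile them, a last-writer-wins rule. This policy is an assumption chosen for illustration, not something the replication types above prescribe, and the `Version` class and `merge` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int   # logical clock value, assumed comparable across replicas
    replica_id: str  # tie-breaker when timestamps are equal


def merge(a, b):
    """Pick the version with the larger (timestamp, replica_id) pair."""
    return max(a, b, key=lambda v: (v.timestamp, v.replica_id))


# Two replicas updated the same order status independently.
at_r1 = Version("shipped", timestamp=5, replica_id="r1")
at_r2 = Version("cancelled", timestamp=7, replica_id="r2")

winner = merge(at_r1, at_r2)
print(winner.value)
```

Both replicas reach the same result regardless of the order in which they compare the versions, but the losing update is silently discarded, which is why systems that need stricter guarantees tend to use more elaborate conflict handling.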
