Explaining CDC (Change Data Capture)

Pavol Z. Kutaj - Oct 11 - - Dev Community

def

  • CDC has been around for 10-15 years or so; what is new is the demand for change data
  • CDC is the technology that turns a database change log into a stream of events
  • kind of like time series data, but of the changes that happen to the database itself
  • every single write, every single insert and update, is captured in the change log of the database
  • we have something that reads the database log
  • not the database directly — we're not querying the tables and using database resources that way
  • we're looking at the log file of the database
  • turning that into a stream of events that then you can do stuff with
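
A minimal sketch of the idea above: read the change log, not the tables, and turn each entry into an event. The JSON log format, the field names (`lsn`, `op`, `key`, `row`) and the function `read_change_events` are assumptions made up for illustration; real database logs are engine-specific and usually binary, and in practice a CDC tool does this reading for you.

```python
import json
from typing import Iterable, Iterator

# Hypothetical change-log entries, one JSON document per line.
# "lsn" stands in for a log sequence number: the entry's position in the log.
CHANGE_LOG = [
    '{"lsn": 1, "op": "insert", "table": "accounts", "key": 1, "row": {"balance": 100}}',
    '{"lsn": 2, "op": "update", "table": "accounts", "key": 1, "row": {"balance": 250}}',
    '{"lsn": 3, "op": "delete", "table": "accounts", "key": 1, "row": null}',
]


def read_change_events(log_lines: Iterable[str], after_lsn: int = 0) -> Iterator[dict]:
    """Yield one event per log entry, in log order.

    Only the log is read; the tables themselves are never queried.
    Entries at or before `after_lsn` are skipped so a consumer can resume.
    """
    for line in log_lines:
        entry = json.loads(line)
        if entry["lsn"] > after_lsn:
            yield entry


# Every write shows up as its own event, in the order it happened.
events = list(read_change_events(CHANGE_LOG))
for event in events:
    print(event["lsn"], event["op"], event["table"], event["key"], event["row"])
```

Passing `after_lsn` lets a downstream consumer pick up where it left off, which is one reason a log position is a convenient thing to carry on each event.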

use

  • backup and disaster recovery
  • you can create exact point-in-time replicas of the database
  • what batch data loading misses is the changes that happen in between snapshots
  • this matters if you're trying to do something like fraud detection, or train a machine learning model on real-world data sets
  • things can happen at a higher frequency than your database snapshots, and you'd miss them
  • with change data capture you get all of that, and there's tons of
. . . . . . . .
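
A small sketch of the two uses above: replaying change events gives a replica as of any point in the log, and two snapshots can look identical even though changes happened between them. The event shape and the `replay` helper are hypothetical, chosen only for this example.

```python
from typing import Dict, List


def replay(events: List[dict], up_to_lsn: int) -> Dict[str, dict]:
    """Rebuild table state as of `up_to_lsn` by applying events in log order."""
    state: Dict[str, dict] = {}
    for event in events:
        if event["lsn"] > up_to_lsn:
            break
        if event["op"] == "delete":
            state.pop(event["key"], None)
        else:  # insert or update: keep the latest row image
            state[event["key"]] = event["row"]
    return state


# Hypothetical events: a balance spikes and is reverted between two snapshots.
EVENTS = [
    {"lsn": 1, "op": "insert", "key": "acct-1", "row": {"balance": 100}},
    {"lsn": 2, "op": "update", "key": "acct-1", "row": {"balance": 5000}},
    {"lsn": 3, "op": "update", "key": "acct-1", "row": {"balance": 100}},
]

snapshot_before = replay(EVENTS, up_to_lsn=1)  # what a batch load sees first
snapshot_after = replay(EVENTS, up_to_lsn=3)   # what the next batch load sees

# The snapshots are equal, so comparing them shows no change at all.
print(snapshot_before == snapshot_after)

# The change stream still contains both intermediate updates.
missed = [e for e in EVENTS if 1 < e["lsn"] <= 3]
print(len(missed))

# A point-in-time replica as of lsn 2 shows the spike itself.
print(replay(EVENTS, up_to_lsn=2))
```

A fraud check that only compares snapshots would never see the 5000 balance here; one that consumes the events would.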