Skip to main content

3. Full Import and Streaming Incremental Synchronization from PostgreSQL to Klustron

KlustronAbout 3 min

3. Full Import and Streaming Incremental Synchronization from PostgreSQL to Klustron

01 Environment

Migrate business data from native PostgreSQL database to the Klustron cluster. Since Klustron is compatible with the PostgreSQL protocol, it can directly support PostgreSQL data import/export tools: pg_dump and pg_restore.

However, since Klustron currently only supports PostgreSQL version 11.5, for incompatible versions: newer version tools can be compatible with older versions, while older version tools do not support newer versions.

02 Steps for Data Export and Import

Due to the fact that Klustron requires tables to have primary keys when inserting data, to ensure the success of data migration, the process of exporting data from the native PostgreSQL database needs to be performed in two steps (first, export the data structure, and second, export the actual data).

During the recovery process, the data structure is restored first, and then the data is imported. Below are two scenarios demonstrating the specific operation instructions.

2.1 Export tables

2.1.1 Export table structure

pg_dump -h 192.168.0.142 -p 5432 -tcustomer1_3 -t customer1_4  postgres-s>customer1_34str.sql  (-s parameter is to export only the table structure)

2.1.2 Export table data

pg_dump -h 192.168.0.142 -p 5432 -tcustomer1_3 -t customer1_4  postgres  -a >customer1_34_data.sq

2.1.3 Restore table structure

psql --dbname=bktest3  --host=192.168.0.142  --port=5401 -U abc -f  customer1_34str.sql

2.1.4 Restore table data

psql --dbname=bktest3  --host=192.168.0.142  --port=5401 -U abc -f  customer1_34_data.sql

2.2 Export a full database

2.2.1 Export the structure of the full database

pg_dump -d postgres -U postgres  -s>postgres_str.sql

2.2.2 Export the structure of the full database

pg_dump -d postgres -U postgres  -a>postgres_data.sql

Alternatively, data can be exported in dump or tar format to reduce space and improve performance:

pg_dump -d postgres -U postgres  -a  -Fc >postgres_data.dump (尺寸最小)
pg_dump -d postgres -U postgres  -a  -Ft >postgres_data.tar

2.2.3 Restore the full database structure

psql --dbname=bktest2  --host=192.168.0.142  --port=5401 -U abc -f  postgres_str.sql

2.2.4 Restore the full database data

psql --dbname=bktest2  --host=192.168.0.142  --port=5401 -U abc -f  postgres_data.sql

If the data was exported in dump format, pg_restore should be used:

pg_restore -h 192.168.0.142 -p 5401 -U abc-d bktest2 postgres_data.dump

(Note that pg_restore can only be backwards compatible. Files exported by higher version pg_dump cannot be restored with lower version pg_restore.)

03 Oracle GoldenGate Real-time Synchronization from PostgreSQL to Klustron

3.1 Install and deploy Oracle GoldenGate

You need to download and install the OGG for PostgreSQL microservices version, oggs_Linux_x64_PostgreSQL_services_shiphome version. Follow the official installation manual to install it.

The installation and deployment architecture is shown in the following figure:

3.2 Configure PostgreSQL data source and Klustron target

Source database information: IP address 192.168.0.10, replication user: postgres, database name: PGSOUR

Target database information: IP address 192.168.0.12, replication user: postgres, database name: Klustron

Set the database parameters:

alter system set wal_level='logical';
alter system set max_replication_slots = 20;

Write OGG odbc.ini file:

[ODBC Data Sources] 
PGDSN1=DataDirect 14 PostgreSQL Wire Protocol 
PGDSN2=DataDirect 11 PostgreSQL Wire Protocol   

[ODBC] 
IANAAppCodePage=106 
InstallDir=/ogg21/ogg_ma 

[PGDSN1] 
Driver=/ogg21/ogg_ma/lib/GGpsql25.so 
Description=DataDirect 14 PostgreSQL Wire Protocol 
Database=PGSOUR
HostName=192.168.0.10
PortNumber=5432 
LogonID=postgres 
Password=Klustrontest

[PGDSN2] 
Driver=/ogg21/ogg_ma/lib/GGpsql25.so 
Description=DataDirect 11 PostgreSQL Wire Protocol 
Database=Klustron 
HostName=192.168.0.12 
PortNumber=5401 
LogonID=postgres 
Password=Klustrontest

3.3 Configure the replication environment

Create identity credentials and checkpoint table:

Login:

3.4 Configure and start the extract process

Extract process configuration file:

extract ext1 
SETENV(PGCLIENTENCODING = "UTF8" ) 
SETENV(ODBCINI="/ogg21/ogg_ma/odbc.ini" ) 
SOURCEDB PGDSN1 USERIDALIAS PG1, DOMAIN OGGMA 
exttrail ./dirdat/e1 
IGNOREREPLICATES 
TRANLOGOPTIONS FILTERTABLE ogg.ckpt 
table public.*;

3.5 Configure and start the replicat process

Replicat process configuration file:

replicat rep1 
SETENV(PGCLIENTENCODING = "UTF8" ) 
SETENV(ODBCINI="/ogg21/ogg_ma/odbc.ini" ) 
SETENV(NLS_LANG="AMERICAN_AMERICA.AL32UTF8") 
TARGETDB PGDSN2 USERIDALIAS PG2, DOMAIN OGGMA 
map public.*, target public.*;

3.6 Perform Synchronization and Check Synchronization

You can insert data into the source database and check the results in the target database:

You can also check the replication progress through the management interface:

3.7 Stress Testing

Stress testing includes two parts (due to the low configuration of the test virtual machine environment, the amount of data tested is relatively small).

On the source side, use sysbench to insert real-time data into PostgreSQL and check the update of data on the target side Klustron.

The target database runs under certain loads.

Execute data loading on the source side:

sysbench oltp_read_write --table-size=100000 --tables=10  --pgsql-host=192.168.0.10   --pgsql-port=5432  --pgsql-user=postgres  --pgsql-password=Klustrontest --pgsql-db=PGSOURE  --db-ps-mode=disable  --threads=1 --db-driver=pgsql   prepare

Target database executes stress testing scripts:

sysbench oltp_read_write --table-size=100 --tables=10  --pgsql-host=192.168.0.12   --pgsql-port=5401  --pgsql-user=abc  --pgsql-password=abc --pgsql-db=postgres  --db-ps-mode=disable  --threads=10 --db-driver=pgsql  --time=600 run

Target side checks data synchronization status:

select count(*) from sbtest10;
 count
-------
  100000
(1 row)

The test conclusion is that the real-time synchronization from PostgreSQL to Klustron using Oracle GoldenGate is successful and stable.

END