When copying data from RDS to Redshift:
To avoid data loss, start the ‘Incremental copy template’ before the ‘Full copy’.
A sample timeline could be:
Incremental copy scheduled start time – 1:50 PM
Full copy start time – 2:00 PM
First DB insert – 2:10 PM
Full copy end time – 4:00 PM
Second DB insert – 4:05 PM
Incremental copy first run – 4:10 PM
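> In the above example, the first DB insert at 2:10 PM may or may not be included in the Full copy, because it happens while the Full copy is still running.
> The second insert at 4:05 PM will not be included in the Full copy, because the Full copy has already ended at 4:00 PM.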
How can you ensure that these new inserts show up in the Redshift database?
> Because the ‘Incremental copy template’ uses TIME SERIES scheduling, the actual ‘Incremental copy activity’ run won't start at the scheduled start time (1:50 PM). Instead, it starts at the end of the first scheduled period (4:10 PM in this example). All DB changes made between the scheduled start date/time and the first run of the actual copy activity will be copied to Redshift.
> So the first incremental copy run will copy all new DB inserts made between 1:50 PM and 4:10 PM to Redshift. This includes both DB inserts that happen during or after the Full copy activity.
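The timeline above can be checked with a small sketch. This is only an illustration of the window logic, not code from the template or a call to any AWS API. The function name `in_first_incremental_window`, the helper `at`, the arbitrary date, and the choice of a start-inclusive, end-exclusive window are all assumptions made for this example.

```python
from datetime import datetime

def in_first_incremental_window(change_time, scheduled_start, first_run):
    """Return True if a DB change falls between the scheduled start time
    and the first actual run of the incremental copy.

    Assumption: the window is start-inclusive and end-exclusive; the
    template's real boundary handling may differ.
    """
    return scheduled_start <= change_time < first_run

def at(hhmm):
    """Hypothetical helper: build a datetime on an arbitrary fixed date."""
    return datetime.strptime("2024-01-01 " + hhmm, "%Y-%m-%d %H:%M")

# Times taken from the sample timeline above.
scheduled_start = at("13:50")  # Incremental copy scheduled start time
first_run = at("16:10")        # Incremental copy first run

inserts = {
    "insert during Full copy": at("14:10"),
    "insert after Full copy": at("16:05"),
}

# Check which inserts the first incremental run would pick up.
covered = {
    name: in_first_incremental_window(t, scheduled_start, first_run)
    for name, t in inserts.items()
}

for name, is_covered in covered.items():
    print(name, "-> covered by first incremental run:", is_covered)
```

Both inserts fall inside the 1:50 PM to 4:10 PM window, which is why scheduling the incremental copy before the Full copy closes the gap. If the incremental copy were instead scheduled to start after 2:10 PM, the first insert would fall outside the window and would depend entirely on the Full copy picking it up.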