In most cases relational databases do not handle batch processing of large amounts of data very well. A major reason is transaction logging, which in practice caps how much data you can change in a single transaction.
Loading a lot of data is an operational concern, so an operational approach is the logical choice.
Break your file up externally to the DB (using command-line tools) into many smaller files of, say, 10,000 rows each, and load each one the way you would have loaded the single large file. Keep the chunking logic outside the DB.
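As a minimal sketch of that approach: generate the chunks with GNU split and load each one in its own transaction. The file, database, and table names below are placeholders, and the psql invocation is echoed rather than executed so the sketch stands alone.

```shell
# Generate a sample "large" input file (25,000 rows) so the sketch is
# self-contained; in practice this would be your real export file.
seq 1 25000 > big_table.csv

# Split into 10,000-row chunks outside the database (GNU split).
split -l 10000 -d --additional-suffix=.csv big_table.csv chunk_

# Load each chunk separately so each \copy runs in its own transaction,
# keeping the per-transaction logging footprint small.
# "mydb" and "staging_table" are placeholder names; the command is echoed
# here instead of run.
for f in chunk_*.csv; do
  echo psql -d mydb -c "\\copy staging_table FROM '$f' CSV"
done
```

Because each chunk commits independently, a failure partway through only forces you to reload the remaining chunks, not the whole file.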
This is supplemental to the function given in the question and covers the next steps once the database is dumpable.
Your next steps should be:
Run pg_dumpall and restore on a physically different system. The reason is that at this point we don't know what caused this, and the chances are not bad that it is hardware.
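That step might look like the following; the host names and file path are placeholder examples, and the commands assume you can reach both clusters as a superuser (omit a runnable test here, since this requires live servers):

```shell
# Dump the entire cluster (all databases, roles, tablespaces) from the
# suspect host. "old-host" and "new-host" are placeholder names.
pg_dumpall -h old-host -U postgres -f cluster.sql

# Restore onto known-good hardware by replaying the SQL dump.
psql -h new-host -U postgres -d postgres -f cluster.sql
```

A plain-SQL dump also gives you a chance to catch corruption during the dump itself: if pg_dumpall fails on a table, you have narrowed down where the damage is.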
You need to take the old system down and run hardware diagnostics on it, looking for problems. You really want to find out what happened so you don’t run into it again. Of particular interest:
Double check ECC RAM and MCE logs
Look at all RAID arrays and their battery backups
CPUs and PSUs
If it were me, I would also look at environmental factors such as incoming AC power and datacenter temperature.
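A few of the checks above can be started from the OS before taking the box to vendor diagnostics. These are common Linux tools, not a definitive procedure; exact tool names vary by distribution and hardware (again no runnable test, since this needs real hardware and root access):

```shell
# Machine-check exceptions (CPU/ECC events) reported by the kernel.
dmesg | grep -i 'machine check'

# EDAC/ECC memory error counters, if the kernel exposes them.
grep -r . /sys/devices/system/edac/mc/ 2>/dev/null | grep -i ce_count

# Disk health via SMART (smartmontools); /dev/sda is an example device.
smartctl -a /dev/sda
```

Anything these surface is a strong hint; a clean result does not prove the hardware is good, so the full offline diagnostics are still worth running.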
Go over your backup strategy. In particular, look at PITR (and the related utility Barman). Make sure you can recover if you run into a similar situation in the future.
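As a sketch, PITR depends on continuous WAL archiving. A minimal postgresql.conf excerpt might look like this; the archive path is an example, and with Barman you would typically use its own WAL-shipping helper rather than a plain cp:

```ini
# postgresql.conf excerpt -- enable WAL archiving for point-in-time recovery.
wal_level = replica
archive_mode = on
# Example archive_command from the PostgreSQL docs; with Barman you would
# normally archive via Barman's shipping mechanism instead.
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```

With base backups plus an unbroken WAL archive, you can restore to a point in time just before the corruption appeared instead of losing everything since the last dump.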
Data corruption doesn’t just happen. In rare cases it can be caused by bugs in PostgreSQL, but in most cases it is due to your hardware or due to custom code you have running in the back-end. Narrowing down the cause and ensuring recoverability are critical going forward.
Assuming you aren't running custom C code in your database, your data corruption is most likely due to something in the hardware.