Affects Version/s: Fedora 4.7.0, Fedora 4.7.4
Fix Version/s: None
During a batch load process, a set of eight binaries (four files each for two pages of a single item) were POSTed to fcrepo, received a "201 Created" response, and were subsequently PATCHed (on fcr:metadata) without any errors. At the time of this load we were on 4.7.0, running in a production RHEL VM in our datacenter.
Subsequently (several months later), the binaries were determined to be missing from the underlying data store when GET requests on the binaries were found to return a "500 Internal Server Error" response. Requests to the fcr:metadata nodes were still returning the correct metadata for the files that were no longer there. By this time we had upgraded to 4.7.4.
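The symptom described above (fcr:metadata still answering while the binary itself fails) can be checked for across a list of resources. The following is only a minimal sketch and not part of fcrepo: `http_status` and `find_orphaned_metadata` are hypothetical helper names, the URIs are assumed to be the binary URIs recorded at load time, and authentication is assumed to be unnecessary or handled elsewhere.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_status(uri: str) -> int:
    """Issue a GET and return the HTTP status code, including error statuses."""
    try:
        with urllib.request.urlopen(uri) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we need.
        return err.code


def find_orphaned_metadata(
    binary_uris: Iterable[str],
    get_status: Callable[[str], int] = http_status,
) -> List[str]:
    """Return binary URIs whose fcr:metadata answers 200 but whose content does not."""
    orphaned = []
    for uri in binary_uris:
        binary_status = get_status(uri)
        metadata_status = get_status(uri.rstrip("/") + "/fcr:metadata")
        if metadata_status == 200 and binary_status != 200:
            orphaned.append(uri)
    return orphaned
```

Passing `get_status` as a parameter keeps the comparison logic testable without a running repository.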
The original load took place on February 27 (see the attached log file for the request-response cycle when these were loaded).
The binaries were found to be missing on August 18 (see the attached log file from an OCR extraction process we were running to parse ALTO XML and create web annotation resources representing the OCR for the text blocks on each page).
Our working hypothesis is that the binaries were received by fcrepo without problems (leading to the positive responses and the creation of fcr:metadata), but that something prevented them from being copied to their final Modeshape storage location. This is only a hypothesis. At a minimum, we could investigate whether fcrepo could perform additional checks to ensure that this copy operation succeeds before responding with a 201.
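Until such a check exists server-side, a loader could approximate it from the client by reading each binary back after the 201 and comparing checksums. This is a sketch under stated assumptions, not a description of fcrepo's behavior: `verify_stored_binary` is a hypothetical helper, and `fetch` stands in for whatever HTTP client call the loader uses to GET the binary, returning the status code and response body.

```python
import hashlib
from typing import Callable, Tuple


def verify_stored_binary(
    local_bytes: bytes,
    fetch: Callable[[], Tuple[int, bytes]],
) -> bool:
    """Return True only if the stored binary can be read back and matches the local copy."""
    status, body = fetch()
    if status != 200:
        # A 500 here is the failure mode described in this report.
        return False
    # Compare SHA-1 digests of the uploaded bytes and the bytes read back.
    return hashlib.sha1(body).hexdigest() == hashlib.sha1(local_bytes).hexdigest()
```

A read-back immediately after the load would not catch content that disappears later, so it complements a periodic audit rather than replacing one.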
Finally, it should be noted that we cannot be 100% certain that the files "never arrived at their final Modeshape location" (as stated in the title of this report), because we don't have any log files or backups that cover the period in question. It does seem the most likely explanation, however, since the only thing that connects these eight binaries (at the Modeshape level) is that they were all loaded at the same time.