Wednesday, May 11, 2016

[Store:280019]There was an error while writing to a storage

The problem

We discovered that an Oracle® WebLogic managed server was in the FAILED state in a development environment. The apparent reason for this was the following error: [Store:280019]There was an error while writing to a storage

The cause

There can be multiple reasons for the above error, but the logical guess was that our /data mount point had run out of disk space, because that mount point was where our domain configuration (and persistent stores) were saved. That theory turned out to be wrong: the /data mount point was only 12% utilized on a 100GB partition, so there was plenty of available disk space.
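Free blocks are not the only resource that can run out: a file system with no free inodes also refuses to create new files, even when plenty of space is left. A minimal sketch of checking both figures for a path follows. The function name fs_usage is my own invention, and the -i option assumes GNU df.

```shell
# Print the space and inode utilisation (in percent) of the file system
# that holds the given path. The -P option selects the POSIX output
# format, which reports each file system on a single line.
fs_usage() {
    path=$1
    # Column 5 of `df -P` is the capacity percentage.
    space=$(df -P "$path" | awk 'NR == 2 { print $5 }')
    # With -i (assumed: GNU df), column 5 is the inode usage percentage.
    inodes=$(df -P -i "$path" | awk 'NR == 2 { print $5 }')
    echo "space=$space inodes=$inodes"
}
```

Calling fs_usage /data prints a single line with both percentages, which makes it easy to rule out either kind of exhaustion in one step.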

Then I tried to create an empty file on the mount point:

# touch /data/dummy
touch: cannot touch `/data/dummy': Read-only file system
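The same condition can be confirmed without attempting a write, by looking at the mount options the kernel reports in /proc/mounts. The following is only a sketch under that assumption. The name is_mounted_ro and its optional second parameter are hypothetical and not part of any tool.

```shell
# Succeed if the given mount point is currently mounted read-only.
# The second argument (the mounts table to read) defaults to /proc/mounts
# and exists only so the function can be tried against a sample file.
is_mounted_ro() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # Fields: device, mount point, type, options, dump, pass.
    # The last matching line wins, because later mounts shadow earlier ones.
    opts=$(awk -v mp="$mount_point" '$2 == mp { o = $4 } END { print o }' "$mounts_file")
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

A call such as is_mounted_ro /data && echo "read-only" would then report the state that the failed touch revealed.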

When checking the /var/log/messages file, I found the reason for that:

May 11 07:37:45 <hostname> kernel: JBD2: I/O error detected when updating journal superblock for dm-25-8.
May 11 07:37:45 <hostname> kernel: EXT4-fs error (device dm-25): ext4_journal_start_sb: Detected aborted journal
May 11 07:37:45 <hostname> kernel: EXT4-fs (dm-25): Remounting filesystem read-only
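To find such entries quickly, the log can be searched for the three message fragments shown above. This is a small sketch. The function name find_ro_remounts is made up for illustration, and the log path is assumed to be /var/log/messages as on this host.

```shell
# Print kernel messages that indicate a journal I/O error, an aborted
# journal, or a forced read-only remount. The patterns are taken from
# the log excerpt above; the log path defaults to /var/log/messages.
find_ro_remounts() {
    log_file=${1:-/var/log/messages}
    grep -E 'JBD2: I/O error|Detected aborted journal|Remounting filesystem read-only' "$log_file"
}
```

Reading /var/log/messages usually requires root privileges, so the function would typically be run from a root shell.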

Comparing the output of cat /etc/fstab with that of ls -l /dev/mapper/ | grep dm-25 confirmed that device dm-25 was indeed the source for the /data mount point.
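The entries in /dev/mapper are normally symbolic links to the kernel's dm-N nodes, so this lookup can also be scripted. The following is a sketch that assumes that layout. The name mapper_names_for is a hypothetical helper, not an existing command.

```shell
# Print the names under a device-mapper directory whose symlink points at
# the given kernel node (for example dm-25). The directory defaults to
# /dev/mapper; the second parameter exists only for experimentation.
mapper_names_for() {
    node=$1
    dir=${2:-/dev/mapper}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink, such as /dev/mapper/control.
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Compare only the last path component of the link target.
        if [ "${target##*/}" = "$node" ]; then
            echo "${link##*/}"
        fi
    done
}
```

The name printed by mapper_names_for dm-25 can then be looked up in /etc/fstab to see which mount point it backs.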

The solution

After shutting down everything as cleanly as I could, I checked if there were still processes trying to use the /data mount point:

lsof | grep "/data"

It turned out that there were a few processes I was unable to shut down cleanly, so I had to kill those.
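Note that a plain grep for "/data" also matches unrelated paths such as /database. A sketch of a stricter filter for lsof output follows. The name under_mount is a hypothetical helper, and it assumes that the NAME column is the last field and contains no spaces.

```shell
# Read lsof-style lines on stdin and keep only those whose last field
# (assumed to be the NAME column) is the mount point itself or a path
# beneath it. File names containing spaces are not handled.
under_mount() {
    mp=${1%/}   # drop one trailing slash, so /data/ and /data behave alike
    awk -v mp="$mp" '$NF == mp || index($NF, mp "/") == 1'
}
```

It would be used as lsof | under_mount /data. Alternatively, fuser -vm /data lists the processes that access files on the file system mounted at /data.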

Then I unmounted the /data mount point:

umount /data

Finally, I fixed the issue by running fsck and mounting the partition again.

Caution: Running fsck on a partition can, in the worst case, be destructive, so make sure you have a valid backup!

fsck /data
mount /data

Now I could start up Oracle® WebLogic Server without any issues.
