Asustor + Btrfs + R/W cache = data loss risk

#1
So, this is the 2nd time in 2 years that I have lost some important data on my Asustor AS6510T

I have 5x 18TB drives in RAID 5 with Btrfs, plus 2x 1TB SSDs for caching.

The first time was in 2022 during the Deadbolt attack. Most of the data loss was not due to Deadbolt itself (it only cost me a few less important files), but to the ADM update that was supposed to fix it:
- after the attack, the NAS hung for 30 minutes in the initialization phase, so I couldn't install the ADM update
- removing the SSDs made it start up with the new ADM, but it did not recognize my Volume 1 (corrupted)
- after trial and error, I put the SSDs back in and it suddenly recognized my Volume 1 again, although some files were lost

Last week, I bought 2 extra 18TB drives to increase my volume size and upgrade to RAID 6:
- I did "safely remove cache" in ADM, but after 30 minutes it was still stuck on 0%
- So I had no other resort than trying a hard reboot (in hindsight I should have waited for hours, but who would expect a 1TB SSD flush to take hours?), and that's when my Volume 1 became inaccessible
- I took the SSDs out, rebooted, put them back in, and then it recognized my Volume 1
- I then safely removed my SSDs (this time it took 15 minutes)
- ADM immediately started a rebuild
- after the rebuild, I got a weird status (see attached pics):
  • RAID Level RAID 5: Drive 1, 2, 3, 4, 5, 5 (drive 5 is listed twice!)
  • RAID 5, data protection None (Faulty drives tolerated 0), even though I have 5 disks of 16.37 TB (18TB) each and a total volume capacity of 65.47 TB.
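
A quick sanity check on those numbers, as a rough sketch only: assuming ADM reports sizes in binary units (TiB) while the drive label uses decimal TB, 65.47 is about four disks' worth of space. That is what a 5-disk RAID 5 with one disk of parity would normally give, which makes the "Faulty drives tolerated 0" status look inconsistent. The helper name `usable_tib` and the simple "disks minus parity disks" model are illustrative assumptions, and filesystem and metadata overhead are ignored.

```python
TB = 10**12   # decimal terabyte, as printed on the drive label
TIB = 2**40   # binary tebibyte (assumed unit of the 16.37 figure)


def usable_tib(disks, disk_tb, parity_disks):
    """Rough usable capacity in TiB: every non-parity disk contributes fully."""
    if parity_disks >= disks:
        raise ValueError("need more disks than parity disks")
    per_disk_tib = disk_tb * TB / TIB
    return (disks - parity_disks) * per_disk_tib


per_disk = 18 * TB / TIB              # one 18TB drive expressed in TiB
raid5 = usable_tib(5, 18, 1)          # current array: 5 disks, 1 parity
no_parity = usable_tib(5, 18, 0)      # what 5 disks with no redundancy would give
raid6_7 = usable_tib(7, 18, 2)        # planned array: 7 disks, 2 parity

print(f"per disk:          {per_disk:.2f} TiB")
print(f"5 disks, RAID 5:   {raid5:.2f} TiB")
print(f"5 disks, no parity: {no_parity:.2f} TiB")
print(f"7 disks, RAID 6:   {raid6_7:.2f} TiB")
```

If the volume really had no redundancy across all five disks, this model would predict a noticeably larger capacity than the one reported, so the capacity and the protection status disagree with each other.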

After the 4-day Qingming holiday, Asustor support will investigate my volume and see if any data is lost.
In the meantime I bought 4 extra disks to copy all my data to.

Asustor clearly stated: Btrfs in combination with R/W cache can corrupt your volume if the SSDs are removed (or fail).
Now that is what I would call a HUGE risk, and as a product manager myself I cannot understand why ADM does not warn users at all or, even better, restrict the use of R/W cache in combination with Btrfs.


After copying my data over, I will probably delete the current volume (crashed 2 times, don't trust it anymore) and create a clean new volume.

So my questions to the forum are :
- Should I use Ext4 instead of Btrfs? Is it really safer? Strangely enough, the same Asustor support told me Btrfs is fine and is actually a good filesystem. Really, after what they said??
- I will probably not use Cache (or at least not R/W cache) ever again...
- or should I just get rid of my Asustor and go back to Synology or build a NAS myself with TrueNAS ?

Any tips here ? people with same experiences ?

What would be the safest NAS (Synology, TrueNAS) and the safest filesystem?
Please take into account that mirroring is not an option with 60+ TB of data, but RAID 6 is.


Attached Files Thumbnail(s)
       
#2
It's disheartening to hear about your data loss experiences with your Asustor NAS setup. Indeed, combining Btrfs with R/W cache on Asustor systems can pose a significant risk, as you've unfortunately experienced firsthand. While Asustor support may have touted Btrfs as a robust file system, the reality of potential data corruption when using R/W cache demands caution. Considering your situation, migrating to Ext4 could offer a safer alternative, as it doesn't involve the same risks associated with Btrfs and cache usage. Additionally, refraining from employing cache, especially R/W cache, seems prudent to mitigate any further data loss risks. As for your NAS choice, both Synology and TrueNAS offer compelling options. Synology's DS1821+ or DS3622xs+ could be worth considering for their robust hardware and reliable performance, while TrueNAS provides a more DIY approach if you're inclined to build your NAS solution. Ultimately, prioritizing data integrity and reliability should guide your decision-making process.
#3
(04-08-2024, 03:17 PM)ed Wrote: It's disheartening to hear about your data loss experiences with your Asustor NAS setup. Indeed, combining Btrfs with R/W cache on Asustor systems can pose a significant risk, as you've unfortunately experienced firsthand. While Asustor support may have touted Btrfs as a robust file system, the reality of potential data corruption when using R/W cache demands caution. Considering your situation, migrating to Ext4 could offer a safer alternative, as it doesn't involve the same risks associated with Btrfs and cache usage. Additionally, refraining from employing cache, especially R/W cache, seems prudent to mitigate any further data loss risks. As for your NAS choice, both Synology and TrueNAS offer compelling options. Synology's DS1821+ or DS3622xs+ could be worth considering for their robust hardware and reliable performance, while TrueNAS provides a more DIY approach if you're inclined to build your NAS solution. Ultimately, prioritizing data integrity and reliability should guide your decision-making process.

Hi Ed, I see this conversation is over a month old, but I wanted to pull on the thread a little as I may be running at risk as well. I'm certainly not an expert when it comes to NAS devices, but I have always used btrfs, perhaps because the majority of my prior hardware has been Synology and they tend to push btrfs as well. My data isn't technically at risk as I maintain copies in triplicate, so losing a volume wouldn't be the end of the world, though it would be a pain to restore, and time consuming for sure.

I was unaware of risks associated with running a btrfs volume in combination with R/W cache. To be frank, it's a bit unclear what purpose the cache is really serving in my particular scenario. It's always at 100% utilization, but the hit rate hovers between 7-8% at any given time. It's always been a bit unclear what performance benefits I'm actually getting from having the cache in the first place. I've long suspected the reason for the low hit rate is that the vast majority of files stored on my NAS are in the 70-100GB range individually (it's all high bitrate 4K media content) and it's entirely random which file might be accessed at any given time. My assumption has always been that the cache would be filled very quickly under those conditions. The cache filled entirely as I was originally copying data to the NAS, which was a push of ~11TB that took over 24 hours to complete (limited by the GbE connection and disk speed of the Synology I was copying the data from).
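
As a back-of-envelope check on that suspicion, here is a minimal sketch of the hit rate you would expect if every byte on the volume were equally likely to be read and the cache simply held a random slice of it. The 11 TB figure is the initial copy mentioned above; the cache sizes are purely hypothetical placeholders (the actual cache size isn't stated here), and `expected_hit_rate` is an illustrative helper, not anything ADM exposes.

```python
def expected_hit_rate(cache_tb, dataset_tb):
    """Steady-state hit rate under uniform random access:
    the fraction of the dataset that fits in the cache, capped at 100%."""
    if dataset_tb <= 0:
        raise ValueError("dataset_tb must be positive")
    return min(1.0, cache_tb / dataset_tb)


DATASET_TB = 11.0  # size of the initial copy described above

# Hypothetical cache sizes, for illustration only.
for cache_tb in (0.5, 1.0, 2.0):
    rate = expected_hit_rate(cache_tb, DATASET_TB)
    print(f"{cache_tb:>4} TB cache: ~{rate:.0%} expected hit rate")
```

Under this crude model, a cache that holds roughly a tenth of the data would only ever hit about a tenth of the time, so a single-digit hit rate on large, randomly accessed media files would not be surprising.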

After completing that initial copy, I have certainly enjoyed the benefits of 10Gb connectivity, both to the NAS, as well as the workstations that typically connect to it. While not perfectly consistent, it can come close to saturating a 10Gb connection when everything else in the path can handle it, at least for read operations.

I digress.

I resurrected this thread out of curiosity since I'm running a similar configuration to OP, and I'm wondering if it would make sense to rebuild the array as ext4 sooner rather than later to avoid potential data loss. Another thing that has always seemed odd, and please forgive my ignorance as I may have overlooked something along the way: there doesn't seem to be a way (at least not that I have found) to "clear" the cache short of unmounting and recreating the cache for the volume. I always figured there would be, at a minimum, a way to perform such an action from the terminal over ssh, but if it does exist (it may) I haven't found it.

Anyhow, my takeaway from this thread is that btrfs combined with R/W cache is risky. Can't say that I understand the nuances of that or why, specifically, it's risky to do so. Regardless, if that is the case I may want to rethink my configuration to avoid future pain and suffering?

Ty... Aaron.

