OpenSolaris derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
Hi _Gea,
Do you still have a link to an ESXi 6.5 compatible Napp-it ToGo Ova template?
I used the workaround to upgrade a server from 6.5 to 6.7 with an unsupported CPU, adding allowLegacyCPU=true to ESXi boot.cfg and adding monitor.allowLegacyCPU=true parameter to each VM. However, it is really slow to boot each VM with this workaround, so it looks like I'll need to revert to 6.5 for this server.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,068
I would not recommend using an old template with an old OmniOS, given the many bug fixes, security fixes and newer features in a current OS.
Instead, install a current OmniOS, 151038 LTS or 151040 stable:

- upload iso to your local datastore, https://omnios.org/download.html
- create a new vm (Solaris 11-64), 35 GB bootdisk, e1000 or faster vmxnet3 vnic
- boot the VM from the OmniOS ISO and install the OS (open-vm-tools with vmxnet3 support are included)

- install napp-it:
wget -O - https://www.napp-it.org/nappit | perl

- set a root password:
passwd root
 

ARNiTECT

Ah, ok.
I like to update to latest OmniOS anyway.
I was just thinking there might be some under-the-hood tweaks in the ova template I might be missing out on.
I'll make one from scratch as you describe.
Thanks.
 

ARNiTECT

I'm currently on OmniOS 038z and noticed that 040 has "Improved NVMe Hot-plug support".

I'm once again revisiting issues on my main server (Supermicro X11SCA-F) with my 3x M.2 NVMe (Corsair MP510 NVMe 1920GB) in a PCIe PLX card (Supermicro AOC-SHG3-4M2P) passed through to OmniOS.

The NVMe drives each work fine when passed through 'without' the PLX and straight into the board; lasting well over a year without error. However, when installed on the PLX and passed through to OmniOS, they initially appear fine, which lasts from only a few hours to over a week, before spitting out 100,000+ read errors and crashing. After the latest attempt, OmniOS won't even boot if I try to pass through the NVMe drive. OmniOS says:

[Attached screenshot: OmniOS_wont_boot.jpg]


I have to remove all the passed through PCI devices then re-add them one by one (HBA 9400-16i and on board SATA controller), but not the NVMe, or it won't boot.
This is the second AOC-SHG3-4M2P PLX card and second X11SCA-F motherboard, so I hope I can rule out a hardware fault.

Options:
- Is there a setting I am missing to allow these drives to work on this PLX card?
- I might try not passing through the NVMe drives and instead put vmdk datastores on them in ESXi for OmniOS to use. Is there anything I should be concerned about with this?
- I will also try my 'Tri-mode' Broadcom 9400-16i, as I understand it can accept 2x NVMe drives, as well as my existing 8x SATA drives.
- I'm aware of the Qnap QM2-4P-384, which also appears to be a PLX; but if my 9400-16i and AOC-SHG3-4M2P have problems, then I can't see the Qnap being any different.
 

_Gea

NVMe passthrough is a delicate part of ESXi. Some configurations work, others do not.
In the latter case I would use the NVMe under ESXi and give VMFS vdisks to the VMs.

The main disadvantage is that all data goes VM -> vdisk driver -> ESXi driver instead of
VM -> native driver. Mostly this is acceptable, and ESXi is known for very good behaviour even with sync writes.
 

ARNiTECT

Thanks Gea,
I set up a Z1 of 3x full NVMe drive vmdk datastores and replicated my VMs back onto the pool, which seemed to go pretty fast. I have a couple of VMs running and the pool has no errors so far.
I'm tempted to try a risky 3x basic vdev pool configuration (with regular snap/replicate). Would there be much difference in IOPS compared to Z1? I have a fairly fast CPU (Xeon E-2278G), 48GB RAM for OmniOS and a 20GB Optane Slog per pool.
 

_Gea

IOPS of a Raid-Z [1-3] are like those of a single disk, while sequential performance of Raid-0/Z scales with the number of data disks.
So IOPS are about the same, while a Raid-0 may be faster sequentially (not relevant with NVMe).
 

ARNiTECT

So, IOPS of Raid-0 is similar to Raid-Z like a single disk?
I thought IOPS of Raid-0 would scale with disks.
 

_Gea

Raid-0 and Raid-Z stripe data over disks.
On access you must wait until every disk is ready.

Only sequential performance can scale, as each disk must hold only a part of the overall data.
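
As a rough illustration of the rule of thumb in this post (sequential throughput scales with the number of data disks), the shell sketch below counts data disks per layout. The helper name `data_disks` and the per-disk rate are made up for the example; real figures depend on hardware and workload.

```shell
#!/bin/sh
# Hypothetical helper: number of data disks for a given layout.
# usage: data_disks <layout> <total_disks>
data_disks() {
    case "$1" in
        raid0)  echo "$2" ;;
        raidz1) echo $(( $2 - 1 )) ;;
        raidz2) echo $(( $2 - 2 )) ;;
        raidz3) echo $(( $2 - 3 )) ;;
        *) echo "unknown layout: $1" >&2; return 1 ;;
    esac
}

# Assumed sequential rate of one disk in MB/s (illustrative only).
PER_DISK_MBS=1000

for layout in raid0 raidz1; do
    n=$(data_disks "$layout" 3)
    echo "$layout with 3 disks: $n data disks, ~$(( n * PER_DISK_MBS )) MB/s sequential"
done
```

With three disks, the Z1 layout has two data disks and the basic (Raid-0) layout has three, which is the whole sequential difference under this simplified model.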
 

taroumaru

Weaksauce
Joined
Dec 22, 2005
Messages
78
OmniOS r151040
napp-it LATEST

Hi _Gea

  1. Previously, on SOLARIS 11.0/11.1 & napp-it 0.9+, I could create SMB-only users from a menu in napp-it. I can't seem to locate this anymore! How do I do this again? Is it a limitation of OmniOS or of newer versions of napp-it?
    1. SMB/NFS-only users could only access network shares, but were unable to log in at the console or SSH into the server. These users were also not part of any local user groups, I think. This was a very good security feature!
  2. In the napp-it SMB menu, there are two SMB services that can be enabled. Which one needs to be enabled to create shares that Windows/Linux or other clients can access?
    1. SMB Server Service
    2. SMB Client Service
  3. What is the max version of SMB that the in-kernel (illumos) CIFS server supports?
Lastly, I think I'm already starting to not like OmniOS much! Whereas SOLARIS has extensive documentation (even as downloadable PDFs), OmniOS is very sparse on this. I don't even know which ZFS & ZPool versions OmniOS supports. For that matter, I can't find the list of features supported by OmniOS either.
 

_Gea

1.
You cannot create SMB only users. Every user must be a regular Unix user. This is the case for Solaris and its forks. Only difference is the password. For Unix the pw hash is stored in /etc/shadow while the SMB password is in /var/smb/smbpasswd (different structure).

If you create a user in napp-it no shell is assigned to this user. This means you can login via SMB only but not to a shell.

2. If you want to access OmniOS via SMB from another computer you must enable the server service. If you want to access an SMB share on OmniOS from another server, e.g. a Windows computer, you need the client service.

3. Currently SMB 3.1.1.
See the changelogs for newer versions, e.g.
https://github.com/omniosorg/omnios-build/blob/r151038/doc/ReleaseNotes.md (current lts)
https://github.com/omniosorg/omnios-build/blob/r151040/doc/ReleaseNotes.md (current stable)

OmniOS ZFS (pool v5000, filesystem v5) is largely in sync with OpenZFS. For supported/enabled features, see the napp-it menu Pools > Features.
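
The same information can also be queried on the command line. A minimal sketch, assuming a pool named `tank` (a placeholder); the `run` helper and `show_zfs_info` are names invented for this example, and by default the commands are only printed:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
# Note: names containing spaces would need extra quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: show_zfs_info <pool>
show_zfs_info() {
    run "zpool get version $1"              # pool version; typically '-' once feature flags are in use
    run "zfs get version $1"                # filesystem version of the pool's root dataset
    run "zpool get all $1 | grep feature@"  # state of each feature flag on this pool
    run "zpool upgrade -v"                  # feature flags supported by the running OS
}

show_zfs_info tank   # "tank" is a placeholder pool name
```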

Currently the Solaris manuals are quite ok for the Solaris forks especially the older ones 11.2 and 11.3,
https://docs.oracle.com/en/operating-systems/solaris.html.

Besides the manuals at https://napp-it.org/manuals/index_en.html, you can find manuals at https://omnios.org/ (see menu Documentation) or https://www.openindiana.org/. For the options of the two main ZFS executables, see https://illumos.org/man/1M/zfs and https://illumos.org/man/1M/zpool
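
Regarding point 1 above (a regular Unix user without a login shell), a manual equivalent could look roughly like the sketch below. It is not napp-it's actual code: the user name, the choice of `/usr/bin/false` as shell and the helper names are assumptions, and it presumes `pam_smb_passwd` is already configured in /etc/pam.conf so that `passwd` also stores the SMB hash.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: add_share_only_user <name>
add_share_only_user() {
    # -s sets the login shell; a shell that exits at once denies console/SSH logins.
    run "useradd -s /usr/bin/false $1"
    # Set the password after user creation so both password stores get an entry.
    run "passwd $1"
}

add_share_only_user paul   # "paul" is a placeholder user name
```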
 

taroumaru

_Gea napp-it can't execute this from the CMD?
Code:
smartctl -v

> 1.
> You cannot create SMB only users. Every user must be a regular Unix user. This is the case for Solaris and its forks. Only difference is the password. For Unix the pw hash is stored in /etc/shadow while the SMB password is in /var/smb/smbpasswd (different structure).
>
> If you create a user in napp-it no shell is assigned to this user. This means you can login via SMB only but not to a shell.
So it is still possible to create users that cannot log on at the console. Good to know.

> 2. If you want to access OmniOS via SMB from another computer you must enable the server service. If you want to access an SMB share on OmniOS from another server, e.g. a Windows computer, you need the client service.
So let me see if I got this right:
a] SMB Server = Windows 11 workstation -> OmniOS SMB share
b] SMB Client = Windows Server 2022 -> OmniOS SMB share

Did I get this right?

I know about all that, but it's still not as extensive & easily accessible as SOLARIS. For example, the OmniOS Release Notes don't say which attributes of the latest ZFS version (5000) are currently supported.

> OmniOS ZFS (pool v5000, filesystem v5) is largely in sync with OpenZFS. For supported/enabled features, see the napp-it menu Pools > Features.
Thanks, that was exactly what I was looking for.

> Currently the Solaris manuals are quite ok for the Solaris forks especially the older ones 11.2 and 11.3,
> https://docs.oracle.com/en/operating-systems/solaris.html.
This is actually what I use daily! OmniOS documentation is seriously not up to par with SOLARIS documentation.

> Besides the manuals at https://napp-it.org/manuals/index_en.html, you can find manuals at https://omnios.org/ (see menu Documentation) or https://www.openindiana.org/. For the options of the two main ZFS executables, see https://illumos.org/man/1M/zfs and https://illumos.org/man/1M/zpool
Your site does have quite a lot of info. I've been using https://man.omnios.org/ when I can't make do with SOLARIS docs, but man.OmniOS.org doesn't even have a simple search function to search through all the documents!

Thanks _Gea, for letting me know that network login only user thing is still available. I think it used to look different back during the napp-it v0.8 & v0.9 era. And please let me know if I got the gist of the SMB server/client services right.
 

_Gea

Extras from OmniOS or pkgsrc are installed under /opt to be OS independent, see
/opt/ooce/smartmontools/sbin/smartctl

SMB services
server: Mac/Windows client -> OmniOS server share
client: OmniOS as client -> Windows server share
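
On the command line, the two roles map to two separate SMF services. A small sketch, assuming the usual illumos service names `smb/server` and `smb/client`; the helper names are invented and commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: enable_smb_role server|client
enable_smb_role() {
    case "$1" in
        server) run "svcadm enable -r smb/server" ;;  # OmniOS offers shares to other machines
        client) run "svcadm enable -r smb/client" ;;  # OmniOS mounts shares from other servers
        *) echo "usage: enable_smb_role server|client" >&2; return 1 ;;
    esac
}

enable_smb_role server
```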
 

taroumaru

> Extras from OmniOS or pkgsrc are installed under /opt to be OS independent, see
> /opt/ooce/smartmontools/sbin/smartctl
OmniOS quirkiness continues!

So I made links to the smartctl binary in /opt/ooce/bin (as that's included in the $PATH variable). Guess what, even more OmniOS shenanigans. From a terminal or SSH I can invoke smartctl, but when I create a Job in napp-it which starts smartctl, I get an error & the job doesn't complete. It turns out that in napp-it I have to provide the full path of the binary, /opt/ooce/bin/name_of_binary. :facepalm:
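
A job script can avoid depending on $PATH by resolving the full path itself. A minimal sketch; `find_tool` is a made-up helper and the directory list is only what was mentioned in this thread:

```shell
#!/bin/sh
# usage: find_tool <name> <dir>...
# Prints the first full path where <name> exists and is executable.
find_tool() {
    name=$1; shift
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    echo "$name not found in: $*" >&2
    return 1
}

# Directories mentioned in this thread; adjust to your system.
SMARTCTL=$(find_tool smartctl /opt/ooce/bin /opt/ooce/smartmontools/sbin /sbin) \
    || SMARTCTL=smartctl   # fall back to a plain PATH lookup
echo "job scripts should call: $SMARTCTL"
```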

I have run into many small things in napp-it during testing so far, but I don't remember whether it was like this back in the days of v0.7/0.8 in 2011 (was there ever a v0.6?):

  1. Can't create rpool (root == /) snaps!
    1. The only options listed are rpool/home & rpool/home/user1
  2. Old versions of napp-it couldn't create a Job that runs after a fresh boot or a reboot. Is this still the case?
    1. Any way to run a task after completion of boot/reboot?
    2. I can't recall what I did to make it work with crontab (@reboot doesn't work with OpenSOLARIS derivatives...)
  3. TLS emailing still not working properly! Did everything to compile all the libraries, but when testing TLS I get the following error:
    1. Code:
      Auth failed: 535 5.7.8 Username and Password not accepted. Learn more at at /var/web-gui/data/napp-it/zfsos/15_Jobs and data services/04_TLS Email/09_TLS-test/action.pl line 80.

> SMB services
> server: Mac/Windows client -> OmniOS server share
> client: OmniOS as client -> Windows server share
Thanks! Shortly after posting that message, while searching for online docs I stumbled on some old docs that explained this in some detail. Guess what, it was SOLARIS 11 Express & 11.2 documentation online (on crap-O-racle website of all places). That was absolutely hilarious.
 

taroumaru

System: SAS 6.0 Gbps backplane connected to IBM M1015; HBA flashed to LSI v20.00.07.00-IT firmware.
Old: SOLARIS 11.2
OLD pool: formatted as ZPool v28 & ZFS v5
Currently: OmniOS ce 151040

Code:
zpool import
   pool: OLD
     id: 99
  state: UNAVAIL
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: http://illumos.org/msg/ZFS-8000-5E
 config:

        OLD         UNAVAIL  insufficient replicas
          raidz2-0  UNAVAIL  insufficient replicas
            c0t1d0  UNAVAIL  corrupted data
            c0t2d0  UNAVAIL  corrupted data
            c0t3d0  UNAVAIL  corrupted data
            c0t4d0  UNAVAIL  corrupted data
            c0t5d0  UNAVAIL  corrupted data
            c0t6d0  UNAVAIL  corrupted data

zpool import -o readonly=on OLD
cannot import 'OLD': pool was previously in use from another system.
Last accessed by <unknown> (hostid=0) at Jul 2021
The pool can be imported, use 'zpool import -f' to import the pool.

zpool import -f -o readonly=on OLD
cannot import 'OLD': invalid vdev configuration

danswartz did you manage to fix & recover your pool with multiple disk showing 'corrupted data'?

P.S. wonder if I should post this on a separate thread?
 

_Gea

Older napp-it installers compiled smartmontools from sources on OmniOS and Solaris. A newer napp-it installs smartmontools on OmniOS from the OmniOS repository (in /opt). The current napp-it can use smartmontools installed in /sbin or under /opt.

Can you reinstall Solaris 11.2 to check if the pool is importable there (pool ok) or not (pool damaged)?
Pool v28/5 should be compatible between Solaris and OpenZFS, but I have not tried this for ages.

You can start a script on bootup with Service > Bonjour and Autostart
or a job with minute=once and others=every

TLS
If you use Gmail you see this unless you allow "unsecure" apps in account settings
 

ARNiTECT

Hi,

I would like our house to carry on functioning if my primary server goes down unexpectedly, or for maintenance/tinkering. I have looked into ESXi: vMotion, vSphere Replication and vSphere High Availability, but was wondering if there is a simpler option using OmniOS/napp-it instead.

I have 2x All-in-One ESXi/OmniOS hosts with Napp-it Home: a primary host for all my server/desktop VM needs and a secondary less capable host for redundancy (hardware specs at end of post). I would like to keep it to just 2 servers if possible.

A. I have 2x Domain Controllers (Windows Server 2019) and I would like at least one to be up and running at all times. Both DCs are on datastores and not on a ZFS filesystem as I would prefer. This is because OmniOS seems to need the DC to be available when it starts, but only allows entering 1x DNS server IP. Could I set this up differently?

B. I would like to maintain access to important SMB folders, such as for Folder Redirection. Switching over manually would be ok, but automatically would be great. How best could I use features of OmniOS/napp-it to achieve this? Eg Replication, or HA?

C. Media Server (Windows 10 Pro) VM stored on an NFS share on the primary host. This includes: Squeezebox LMS to keep our devices working with music streaming services, even if we temporarily can't access our main music SMB folder; and EMBY media server, to allow network TV tuner recordings to continue, new recordings stored temporarily while the main server is off, we can temporarily lose access to previous recordings.

D. I find vCenter Server Appliance a useful tool, but I'm not sure where best to install this (primary/secondary host, datastore/ZFS), but I would like this to be available when the other host is down.

//Host 1 - X11SCA-F, E2278G 8-core, 128GB RAM, 3x GPU
32GB USB (ESXi 7)
Optane 900p 280GB datastore (OmniOS, 2x Slog, VMs)
1TB SSD datastore (scratch / general use)
ZPool-1: 3x 2TB NVMe vmdk, Z1 (VMs, fast access files)
ZPool-2: 6x 8TB HDD pass, Raid10 (Main Filer)
ZPool-3: 8x 3TB HDD pass, Z2 (Replication job to Filer and NVMe pools)
ZPool-4/5: 4x 5TB removable 2.5" HDD (External backup, manual replication)

//Host 2 - X9SCL-F, E3-1275v2 4-core, 32GB RAM
32GB USB (ESXi 7)
60GB SSD datastore (OmniOS)
500GB SSD datastore (scratch / general use)
ZPool-1: 4x 250GB SSD Raid10 (VMs) with 24GB SSD (Slog)
ZPool-2: 2x 2TB HDD (other, to be decided)
 

_Gea

A/B: multiple DNS servers can be set at System > Network Eth > Dns
When you join an AD, you must use the AD for DNS, and the AD must be online all the time.

You can still access SMB with a local user (even if AD is off). To use AD again after it has been offline, you must restart SMB or rejoin.
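
At OS level, several DNS servers are simply several `nameserver` lines in /etc/resolv.conf, and the restart/rejoin can be done from a shell. A hedged sketch: the IP addresses, admin user and domain name are placeholders, the helper names are invented, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: resolv_conf_lines <dns_ip>...
# Prints one "nameserver" line per DNS server, as they would appear in /etc/resolv.conf.
resolv_conf_lines() {
    for ip in "$@"; do
        echo "nameserver $ip"
    done
}

# Restart the SMB server after the AD has come back online.
restart_smb() { run "svcadm restart smb/server"; }

# usage: rejoin_domain <admin_user> <domain>
rejoin_domain() { run "smbadm join -u $1 $2"; }

resolv_conf_lines 192.168.1.10 192.168.1.11   # e.g. DC1 and DC2 (placeholder IPs)
restart_smb
rejoin_domain Administrator example.local     # placeholder user and domain
```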

To access data if filer is offline:
- replicate data to second filer, share SMB: either use readonly or you must replicate newest state back
- HA is an option (two heads to a common multipath SAS storage maybe 'overkill' at home), http://www.napp-it.org/doc/downloads/z-raid.pdf

C
Always-on requires an HA setup. At home you have control over downtime, so keeping it simple may be more important. Use a local datastore for critical VMs. Back them up in offline state as a template (to recover within minutes). For storage, use a minimalistic VM with storage only.

D
vCenter is an "on demand" tool; it need not run all the time.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,710
> zpool import -f -o readonly=on OLD
> cannot import 'OLD': invalid vdev configuration
>
> danswartz did you manage to fix & recover your pool with multiple disk showing 'corrupted data'?

Sorry, I only just noticed this. No, I never figured it out. I took an external 1TB spinner, formatted it as v28 with OmniOS and pushed everything over while still on Linux.
 

ARNiTECT

Thanks Gea,

> A/B: multiple DNS servers can be set at System > Network Eth > Dns
> When you join an AD, you must use the AD for DNS, and the AD must be online all the time.
> You can still access SMB with a local user (even if AD is off). To use AD again after it has been offline, you must restart SMB or rejoin.
Ah yes, it is SMB I am having issues with: SMB > Active Directory > Secondary DC (Solaris 11 only).

> To access data if filer is offline:
> - replicate data to second filer, share SMB: either use readonly or you must replicate newest state back
> - HA is an option (two heads to a common multipath SAS storage maybe 'overkill' at home), http://www.napp-it.org/doc/downloads/z-raid.pdf
I read through the document and I agree that clusters etc. look interesting, but they are a bit overkill for me.
It looks like the best option is: '1.2 Improved Availability Level2 (second standby server with async replication)'

> C: Always-on requires an HA setup. At home you have control over downtime, so keeping it simple may be more important. Use a local datastore for critical VMs. Back them up in offline state as a template (to recover within minutes). For storage, use a minimalistic VM with storage only.
Yes, I currently shut down datastore VMs to clone them to a template, which takes a while. Although it is less efficient with memory, I was wondering if having 2 storage VMs on each host would work, e.g.:
//Host 1 (128GB):
1st - OmniOS napp-it Free (4GB): additional vmdk on Optane, NFS share for boot VMs (DC1 & UPS etc.), not connected to Active Directory, Replication source
2nd - OmniOS napp-it Home (48GB): VM & SMB use, connected to Active Directory, Replication source & target
//Host 2 (32GB):
1st - OmniOS napp-it Free (4GB): NFS share for VMs, not connected to Active Directory, Replication source
2nd - OmniOS napp-it Home (8GB): SMB shares, Replication target

> D: vCenter is an "on demand" tool; it need not run all the time.
I will set it up to manually replicate the VM before maintenance, so I can use it on either host for cloning when datastore VMs are shut down. I just need to ensure I have enough memory left to boot it when required.
 

_Gea

I am not sure if secondary AD support is still working on current Solaris as all my machines are now OmniOS.
On OmniOS you can only join one AD. If you lose AD connectivity, you must rejoin or restart SMB.
 

ARNiTECT

> I am not sure if secondary AD support is still working on current Solaris as all my machines are now OmniOS.
> On OmniOS you can only join one AD. If you lose AD connectivity, you must rejoin or restart SMB.
Yes, I've resigned myself to this.

I have now set up a VM boot order: ESXi > OmniOS 1 (local) > Windows Server > OmniOS 2 (AD) > Other VMs
This seems to be working well.

Next I want to set up the Replications:

//Host 1 - Primary pool, to: Host 1 - Disaster recovery pool
To change an existing Replication job from 'send newest data (-i)' to 'send all intermediary snapshots and clones (-I)', can I just change the job property, so that on the next run the replicated target is updated with all previously excluded snaps? Or do I need to delete the (-i) Replications and create the jobs again with (-I)?

//Host 1 - Primary pool, to: Host 2 - Secondary pool
Suppose I replicate from source Primary (which includes snapshots) to target Secondary using 'send newest data only (-i)', turn Primary off and use Secondary, and then later replicate back (source Secondary, target Primary) using 'send all intermediary snapshots and clones (-I)'. Will the Primary pool keep all of its historical snaps and just be updated with the new snaps from its time in the Secondary pool, or will I lose the Primary's historical snaps?
 

_Gea

Replication with -I transfers all intermediate snaps on the next replication run. Avoid deleting older replication snaps from a former run, as you need at least one common snap pair to continue a ZFS replication.

Prior to an incremental replication, the target filesystem does a rollback to the common snap number. All newer snaps are then destroyed. If you want snaps on a destination target, you must use the keep/hold settings in the replication job settings. This is independent of regular autosnap jobs.
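
In plain ZFS terms (not napp-it's actual job code, which adds its own snapshot naming and network transport), the difference between -i and -I looks roughly like this. Dataset and snapshot names are placeholders, and the pipeline is only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: replicate <-i|-I> <src_fs> <common_snap> <new_snap> <dst_fs>
# -i sends only the delta between the two snaps,
# -I additionally sends every snapshot in between.
# "receive -F" rolls the target back to the common snap first.
replicate() {
    run "zfs send $1 $2@$3 $2@$4 | zfs receive -F $5"
}

replicate -i tank/data repli_1 repli_5 backup/data
replicate -I tank/data repli_1 repli_5 backup/data
```

Both variants need `repli_1` to exist on source and target, which is the common snap pair mentioned above.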

To switch active filer from 1 to 2:
- disable replication, set filesystem on 2 to rw and enable SMB

To switch back:
- replicate newest state back to 1 (create a job on 1 with same jobid to do this based on last snap pair),
- restart replication 1->2
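
The ZFS side of "set filesystem on 2 to rw and enable SMB" can be sketched as follows; disabling the replication job itself is done in napp-it. The filesystem name is a placeholder, `activate_standby` is an invented helper, and commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: activate_standby <filesystem>
# Run on filer 2 after the replication job has been disabled.
activate_standby() {
    run "zfs set readonly=off $1"   # make the replicated filesystem writable
    run "zfs set sharesmb=on $1"    # publish it via the kernel SMB server
}

activate_standby backup/data   # placeholder filesystem name
```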
 

ARNiTECT

Thanks Gea,
I will set up a few test runs with this, as there seem to be a few opportunities for accidents
 

ARNiTECT

> //Host 1 - Primary pool, to: Host 1 - Disaster recovery pool
> To change an existing Replication job from 'send newest data (-i)' to 'send all intermediary snapshots and clones (-I)', can I just change the job property, so that on the next run the replicated target is updated with all previously excluded snaps? Or do I need to delete the (-i) Replications and create the jobs again with (-I)?
>
> Replication with -I transfers all intermediate snaps on the next replication run. Avoid deleting older replication snaps from a former run, as you need at least one common snap pair to continue a ZFS replication.
I have just tested changing a Replication job from -i to -I. The target ZFS filesystem only shows intermediary snaps from the point the job was changed from -i to -I; it does not show older source snaps from before the change (Windows previous versions).
I then tried creating a new Replication job with -I, and this also does not show the source's previous snaps.
Is it possible to replicate the source's previous snaps?

Update: I'm getting there, I have just created a new job using the send option -R and this has replicated all of the source's old snaps to the target. But this would mean having to stop existing replication job, create a new one, check new target has completed ok, delete old target.
- Is there a way to backdate -R to existing replication jobs?
- once the initial -R job has completed, should -R option be removed and changed to -I?
 

_Gea

> I have just tested changing a Replication job from -i to -I. The target ZFS filesystem only shows intermediary snaps from the point the job was changed from -i to -I; it does not show older source snaps from before the change (Windows previous versions).
> I then tried creating a new Replication job with -I, and this also does not show the source's previous snaps.
> Is it possible to replicate the source's previous snaps?
>
> Update: I'm getting there, I have just created a new job using the send option -R and this has replicated all of the source's old snaps to the target. But this would mean having to stop existing replication job, create a new one, check new target has completed ok, delete old target.
> - Is there a way to backdate -R to existing replication jobs?
> - once the initial -R job has completed, should -R option be removed and changed to -I?

You can use any snap for an initial full replication. For ongoing incremental replications you need common snap pairs for a target rollback.

You cannot switch from -i/-I to -R on incremental replications, as you lack the common snap pairs for daughter filesystems/zvols.

As the target filesystem does a rollback on -i/-I to the common base snap, all newer snaps are destroyed and cannot be preserved, so -R is the only option to replicate all datasets (filesystems, snaps, zvols). A switch from -R to -i is possible, but only for a single filesystem. You should probably destroy the replication snaps with the same jobid for other filesystems. In general, if you want more snaps on a replication target, use the replication settings keep and hold.
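
For reference, an initial recursive replication with plain ZFS commands could be sketched like this. It is not napp-it's job code; dataset and snapshot names are placeholders and the commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: full_recursive_send <src_fs> <snap> <dst_fs>
full_recursive_send() {
    # Recursive snapshot so that all daughter filesystems/zvols share the same snap name.
    run "zfs snapshot -r $1@$2"
    # -R sends the filesystem with its descendants, properties and existing snaps.
    run "zfs send -R $1@$2 | zfs receive -F $3"
}

full_recursive_send tank/data repli_1 backup/data
```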
 

ARNiTECT

> You can use any snap for an initial full replication. For ongoing incremental replications you need common snap pairs for a target rollback.
> You cannot switch from -i/-I to -R on incremental replications, as you lack the common snap pairs for daughter filesystems/zvols.
> As the target filesystem does a rollback on -i/-I to the common base snap, all newer snaps are destroyed and cannot be preserved, so -R is the only option to replicate all datasets (filesystems, snaps, zvols).
Ah, ok. If I want the old snaps I will have to recreate the replication jobs using -R.

> A switch from -R to -i is possible, but only for a single filesystem.
For my Host1-poolA to Host1-poolB backup replication jobs (with old snaps), I understand I should set the replication for each single ZFS filesystem with -R and -I.
After the first successful run, should I remove -R from each single filesystem and just keep -I to replicate all future intermediate snaps, or does it not matter?

> You should probably destroy the replication snaps with the same jobid for other filesystems. In general, if you want more snaps on a replication target, use the replication settings keep and hold.
If all source snaps, old and new, are replicated with -R and -I, then I don't think I would need to keep more snaps on the target end; perhaps I should just keep the last 2, in case of a replication problem?


For my Host1 to Host2 redundancy replication jobs, I planned to set the new replications for -i only, to minimise storage use on my secondary host.
Here is my plan, based on your previous suggestion:

To switch from Host1 to Host2:
on Host2 (target):
- create replication jobs to Host1 with -i
- run new jobs and confirm successful
on Host1 (source):
- backup to removable drives
- shutdown OmniOS, or just disable NFS and SMB services
on Host2 (target):
- disable new replication jobs
- set filesystems to read/write
- create & activate autosnap jobs
- enable NFS & SMB for each filesystem
- power on VMs in ESXi
- update DFS Namespace with any required additional targets

To switch back from Host2 to Host1:
on Host2 (source):
- shutdown VMs
- disable autosnap jobs
- disable NFS & SMB shares
on Host1 (target):
- create replication jobs to Host2 filesystems with same jobid (use Force jobid?) and with -I to replicate all new intermediate snaps
- run new jobs once and confirm successful
- Host1-poolA to Host1-poolB backup replication jobs should continue as usual with -I
on Host2 (source):
- re-enable previously created replication jobs to Host1 with -i ready for next time


I hope this works, as no one in the house is happy when I shutdown the server for 'maintenance' :)
 

ARNiTECT

My tests seem to have worked as expected; I have noticed the following though:

After the first run of a replication with -R send option, the source filesystem SMB share is no longer accessible on the network. The source SMB share name is still visible in the list under ZFS filesystems. The new target SMB share is accessible (I have set a different share name to source) and no other filesystems seem to be affected. To regain access, I restart the SMB service, or unshare/reshare the source filesystem. Further replication runs (-i or -I) do not affect the source/target shares and they all remain accessible, so once all replications have completed their first run, this hopefully won't be an issue.
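
The unshare/reshare workaround described above can be scripted. A sketch with placeholder filesystem and share names; `reshare_smb` is an invented helper, and commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: reshare_smb <filesystem> <sharename>
reshare_smb() {
    run "zfs set sharesmb=off $1"        # unshare
    run "zfs set sharesmb=name=$2 $1"    # share again under the same name
}

reshare_smb tank/data data   # placeholder filesystem and share name
```

The alternative mentioned in the post, restarting the SMB service, would be `svcadm restart smb/server`.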

I noticed replications are adding additional users to the SMB share that are not on the source filesystem:
- Current Owner (ZFSserver\Current Owner)
- Current Group (ZFSserver\Current Group)
- Everyone
These are not inherited by the files/folders below.
I can delete them off the source filesystem, but they continue to be added to the target replications.

After an initial replication run, removing the -R and keeping -I doesn't seem to make any difference to the replicated target filesystem compared to keeping the -R and -I.

For host to host reverse replications, I create a new replication job with 'Force jobid' and this works as expected.
For reverse replications on the same host, a new job using 'Force jobid' removes the initial job. I assume this is by design as a jobid must be unique per OS. Can I get around this?
 

_Gea

For the ACL settings you can check aclinherit settings of the filesystem or its parent.
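
A quick way to compare these properties on source and target, sketched with placeholder filesystem names (`show_acl_props` is an invented helper; commands are only printed unless DRY_RUN=0 is set):

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# usage: show_acl_props <filesystem>...
# Shows ACL inheritance settings and where each value comes from (SOURCE column).
show_acl_props() {
    for fs in "$@"; do
        run "zfs get aclinherit,aclmode $fs"
    done
}

show_acl_props tank/data backup/data   # placeholder source and target
```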

If you remove -R, only the filesystem itself is replicated, not the ones below it (-I keeps the datasets between the source base snap and the next incremental source snap).

Job-IDs must be unique
 

taroumaru

> Can you reinstall Solaris 11.2 to check if the pool is importable there (pool ok) or not (pool damaged)?
Sorry, it looks like I gave you the wrong version number. I had 11.3 installed (I was able to read the boot environment menu data from the old SOLARIS boot drive to figure this out).

I think I upgraded from 11.2 to 11.3 back in 2014 or maybe 2015, that's almost 7-8 years ago! I had completely forgotten what version of ZFS it was even running, as I never actively administered that server, other than the initial setup more than 10/11 years ago and the few .X upgrades I had to do. (As I explained in another post before: https://hardforum.com/threads/opens...a-solaris-and-napp-it.1573272/post-1044185771 )

To answer your question, the short version: NO

Long answer: Yes, but only after I changed the MB, CPU, RAM, HBA & boot drive and installed a fresh copy of 11.3 (which I finally found online after searching for a few days). Mainly, though, it was because I was able to import the pool into 11.3 & then export it properly; after booting into OmniOS, I was finally allowed to import this pool (so it was not because of any hardware changes).
  1. Still don't understand why OmniOS couldn't import the pool!
  2. And I still have no idea what went wrong with the previous system (I just hadn't had the time to run lengthy diagnostics on the hardware; during this holiday break, I finally had the time & got the old_pool imported)
    1. So that the problematic hardware components can be identified, if any, that caused the system hang/stop working
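
The sequence that worked here, a clean import and export on Solaris followed by an import on OmniOS, can be summarized as below. The pool name is the one from this thread, the helper names are invented, and commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$1"; else sh -c "$1"; fi
}

# Step 1: on the Solaris 11.3 install.
# usage: clean_export <pool>
clean_export() {
    run "zpool import -f $1"   # -f because the pool was last used by another system
    run "zpool export $1"      # a proper export marks the pool as cleanly released
}

# Step 2: after booting into OmniOS.
# usage: import_pool <pool>
import_pool() {
    run "zpool import $1"
}

clean_export OLD
import_pool OLD
```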

You can start a script on bootup with Service > Bonjour and Autostart
or a job with minute=once and others=every
When I was finally able to import my old pool into OmniOS & started to setup jobs a few days, I kinda stubled upon this myself. Thanks for including this, this is a great lifesaver.

TLS
If you use Gmail you see this unless you allow "unsecure" apps in account settings
I did this but it still doesn't work. I will revisit this issue at a future time, because currently I have more pressing things I need to be doing on OmniOS:
  1. My carefully crafted UID/GID & file/folder permissions are completely hosed!
    1. This is why I avoided transitioning to another OSol derivative since SOLARIS v11.2
    2. Even a few years ago, though I asked you about transitioning to OmniOS, I avoided it like a plague (COVID-19?)
      1. As posted here: https://hardforum.com/threads/opens...a-solaris-and-napp-it.1573272/post-1044183821
  2. Create new pool, transfer data from old pool
  3. New pool SMB sharing (*See the code snippet below):
    1. Can the pool be shared without a share name for the root '/', so that only the subfolders show up? (already was unable to do this when I tried, but not sure if I missed something)
      1. instead of this: \\SERVER\POOL_root\folder1/folder2/folder3
        1. this: \\SERVER\folder1/folder2/folder3
    2. Possible to share some subfolders from the new pool, with read-only permission & Guest access (no login)
      1. A] root of the pool shared (with or without a root share name), which will expose all the subfolders with R/W permission
        1. A-1] R/W: \\SERVER\POOL_root\folder* or \\SERVER\folder*
      2. B] subfolder with R/O permission: \\SOLARIS\folder3a
      3. C] subfolder R/O & Guest access: \\SOLARIS\folder3b-2
  4. Install ups drivers & set it up properly (probably the easiest of the bunch)
*I do have some questions about #3-1 & #3-2:
/
├───folder1
├───folder2
└───folder3
    ├───folder3a
    └───folder3b
        ├───folder3b-1
        └───folder3b-2
  • 3-1] SOLARIS didn't allow subfolders to be shared individually, I hope this is still not the case with OmniOS
  • 3-2] Not possible in SOLARIS, but hoping it all changed in OmniOS
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,068
Pool move
A pool move is possible between Oracle Solaris and Open-ZFS with pool v28/5.
Pool versions > v28 or Solaris zfs v6 are incompatible. If the pool was not exported properly prior to a move, you need zpool import -f poolname
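
A minimal sketch of the steps above, not a tested procedure. The function names and the pool name used in the usage note are made up for illustration; the commands inside are the standard zpool/zfs calls for showing versions, exporting and importing.

```shell
# Sketch only: function names are illustrative, not part of napp-it or ZFS.

# On the Solaris side: show pool and filesystem versions, then export cleanly.
# A pool that can move to Open-ZFS should report pool version 28 and
# filesystem version 5.
prepare_pool_move() {
    pool="$1"
    zpool get version "$pool"
    zfs get -r version "$pool"
    zpool export "$pool"
}

# On the OmniOS side: list importable pools, then import the pool.
# Pass "force" as second argument only if the pool was not exported properly.
import_moved_pool() {
    pool="$1"
    zpool import
    if [ "${2:-}" = "force" ]; then
        zpool import -f "$pool"
    else
        zpool import "$pool"
    fi
}
```

Usage would be something like `prepare_pool_move tank` on the old system and `import_moved_pool tank` (or `import_moved_pool tank force`) on the new one, where `tank` is a placeholder pool name.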

Permissions
Sun did its best to be as Windows NTFS compatible as possible. As a result, the Solarish SMB server permission model is ACL only, with the Windows SID as reference. As ZFS is a Unix filesystem, uid/gid is nevertheless needed. The SID is stored as an extended attribute, and as e.g. the AD name is part of the SID, permissions stay intact if you move a pool e.g. from one AD member server to another, without the need for a complex uid mapping. But traditional Unix permissions like 755 cover only a small part of the NTFS options (inheritance and detailed permissions in particular are missing), and ZFS must respect both ACLs and traditional Unix permissions. So you get problems when you set traditional Unix permissions, as this deletes all inheritance settings and reduces the ACL to a state that can be expressed by Unix permissions. You should never do this with Solaris/OmniOS SMB. Always use ACLs only.

If you reinstall the OS and want to preserve ACL settings, keep AD or hostname and use the same local user with same uid/gid.

Shares
On Solarish, an SMB share is a strict property of a ZFS filesystem. This is different from SAMBA, which shares any folder in a filesystem (it knows nothing about ZFS filesystems). This has the huge advantage that ZFS snaps = Windows previous versions always works, as snaps are also a strict property of a ZFS filesystem, while on SAMBA you must take care of a correct setup. If you nest ZFS filesystems (not regular folders), Oracle Solaris does not allow traversing to nested filesystems. This is bad regarding usability but avoids many problems, as nested filesystems can have their own, different ZFS properties like snaps, upper/lower case naming restrictions, ZFS aclinherit properties or character sets.

\\SERVER\folder1/folder2/folder3 should be:
folder1= a shared ZFS filesystem on SERVER
folder2 or folder3= regular folders (not filesystems)

On OmniOS you can allow traversing to nested filesystems in Service > SMB > properties, but you must be aware of possible problems due to different ZFS properties. In general, I would avoid nested filesystems altogether and share a filesystem with only regular folders below (always be aware of the difference between a folder and a ZFS filesystem, even when they look the same in a file listing)

SMB share settings on Solarish are per filesystem and cannot be different on a regular folder below.
What you can do is restrict access via ACL and hide files/folders where you have no permissions (enable ABE when sharing a filesystem)
 

taroumaru

Weaksauce
Joined
Dec 22, 2005
Messages
78
Pool move
A pool move is possible between Oracle Solaris and Open-ZFS with pool v28/5.
Pool versions > v28 or Solaris zfs v6 are incompatible. If the pool was not exported properly prior to a move, you need zpool import -f poolname
This is why I was so confused, I issued 'zpool import -f', but still OmniOS couldn't import the pool; as mentioned here earlier: https://hardforum.com/threads/opens...a-solaris-and-napp-it.1573272/post-1045210090.

Permissions
Sun did its best to be as Windows NTFS compatible as possible. As a result, the Solarish SMB server permission model is ACL only, with the Windows SID as reference. As ZFS is a Unix filesystem, uid/gid is nevertheless needed. The SID is stored as an extended attribute, and as e.g. the AD name is part of the SID, permissions stay intact if you move a pool e.g. from one AD member server to another, without the need for a complex uid mapping. But traditional Unix permissions like 755 cover only a small part of the NTFS options (inheritance and detailed permissions in particular are missing), and ZFS must respect both ACLs and traditional Unix permissions. So you get problems when you set traditional Unix permissions, as this deletes all inheritance settings and reduces the ACL to a state that can be expressed by Unix permissions. You should never do this with Solaris/OmniOS SMB. Always use ACLs only.

If you reinstall the OS and want to preserve ACL settings, keep AD or hostname and use the same local user with same uid/gid.
.......
SMB share settings on Solarish are per filesystem and cannot be different on a regular folder below.
What you can do is restrict access via ACL and hide files/folders where you have no permissions (enable ABE when sharing a filesystem)
Not only UID/GID, but SID/ACL is completely messed up too! I think it's because when users were recreated on OmniOS, user ID# has changed; e.g. on SOLARIS User1 was 40001, but on OmniOS the same user is 40016.

Given that I probably have a million or more files (and possibly tens of thousands of sub-folders), it's not possible to reset UID, GID, SID/ACL every time I upgrade or install a fresh copy of the OS & need to create users again. Just wish SOLARIS, especially Open Solaris derivatives, would pay more attention to this problem and create a more manageable solution.

Shares
On Solarish, an SMB share is a strict property of a ZFS filesystem. This is different from SAMBA, which shares any folder in a filesystem (it knows nothing about ZFS filesystems). This has the huge advantage that ZFS snaps = Windows previous versions always works, as snaps are also a strict property of a ZFS filesystem, while on SAMBA you must take care of a correct setup. If you nest ZFS filesystems (not regular folders), Oracle Solaris does not allow traversing to nested filesystems. This is bad regarding usability but avoids many problems, as nested filesystems can have their own, different ZFS properties like snaps, upper/lower case naming restrictions, ZFS aclinherit properties or character sets.
.......
On OmniOS you can allow traversing to nested filesystems in Service > SMB > properties, but you must be aware of possible problems due to different ZFS properties. In general, I would avoid nested filesystems altogether and share a filesystem with only regular folders below (always be aware of the difference between a folder and a ZFS filesystem, even when they look the same in a file listing)
Performance, simplicity in creating/maintaining/administering the pool/folders/files & also creating/maintaining permissions is the reason I went with a flat top level directory structure when I first created this pool some 11-12 years ago, and completely avoided multiple ZFS filesystems at the pool '/' (root).

/
├───Fs1
├───Fs2
└───Fs3

If I created multiple ZFS file systems as top level folders on the '/' (root) of a pool (named Fs1, Fs2, Fs3), I'd have to create multiple snapshot jobs to create/delete/maintain these, making the whole thing very complex. Yes, I am aware of '-r' option to recursively create descendant snapshots, but it'd still be a huge mess. Not to mention replication and UID/GID/SID/ACL permissions.

\\SERVER\folder1/folder2/folder3 should be:
folder1= a shared ZFS filesystem on SERVER
folder2 or folder3= regular folders (not filesystems)
I think you misunderstood what I wanted to do; so let me try again.

Currently I have a pool that looks like this:
/
├───folder1
├───folder2
└───folder3

Currently when I share this pool I must provide a top level SMB share name, so the shared directory structure looks like this:
/           ( \\SERVER\SMB_SHARE_NAME\ )
├───folder1 ( \\SERVER\SMB_SHARE_NAME\folder1 )
├───folder2 ( \\SERVER\SMB_SHARE_NAME\folder2 )
└───folder3 ( \\SERVER\SMB_SHARE_NAME\folder3 )

What I want to know is if I can share the pool without a SMB share name, so that the directory structure would look something like this:
/           ( \\SERVER\ )
├───folder1 ( \\SERVER\folder1 )
├───folder2 ( \\SERVER\folder2 )
└───folder3 ( \\SERVER\folder3 )
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,068
This is why I was so confused, I issued 'zpool import -f', but still OmniOS couldn't import the pool; as mentioned here earlier: https://hardforum.com/threads/opens...a-solaris-and-napp-it.1573272/post-1045210090.

Yes I am too. For a pure 28/5 pool this should work

Not only UID/GID, but SID/ACL is completely messed up too! I think it's because when users were recreated on OmniOS, user ID# has changed; e.g. on SOLARIS User1 was 40001, but on OmniOS the same user is 40016.

Given that I probably have a million or more files (and possibly tens of thousands of sub-folders), it's not possible to reset UID, GID, SID/ACL every time I upgrade or install a fresh copy of the OS & need to create users again. Just wish SOLARIS, especially Open Solaris derivatives, would pay more attention to this problem and create a more manageable solution.

In a "production" environment you would use Windows Active Directory for user management. In that case the Windows SID always remains the same, as the Solarish SMB server uses the real AD SID as reference for permissions. If you then import a pool on another AD member server, all permissions remain the same. In workgroup mode with local users, the Windows SID is based on the uid. This means that when you reinstall Solarish and want to keep permissions intact, you must recreate all users with their former uid/gid, or you must reset all permissions. This is similar to a Windows NTFS disk that you move to a different computer: all users are then unknown.

With many local users you can back up and restore the Unix password files in /etc and the Windows password files in /var/smb. You can also follow a usual disaster recovery process where you replicate the current BE to your datapool. To recover, install a minimal Solarish, restore the BE, activate it and reboot into it. With Napp-it you can use BEs as source or target of replication jobs.
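
As a rough sketch of the two options for local users: recreating a user with its former uid, and saving the account files before a reinstall. The function names are hypothetical, and the exact file list (/etc/passwd, /etc/shadow, /etc/group, /var/smb/smbpasswd) is an assumption about what "password files" means here, so check it against your own system first.

```shell
# Sketch only: function names are illustrative; the file list is an assumption.

# Recreate a local user with the uid it had on the old installation, so that
# uid-based SIDs in workgroup mode match the existing ACLs again.
recreate_user() {
    name="$1"
    uid="$2"
    useradd -u "$uid" "$name"
    passwd "$name"    # prompts for a password for the recreated account
}

# Save the Unix and SMB account files into a tar archive before a reinstall.
backup_account_files() {
    target="$1"
    tar cf "$target" /etc/passwd /etc/shadow /etc/group /var/smb/smbpasswd
}
```

With the numbers from the post above, `recreate_user User1 40001` would give User1 its old Solaris uid back instead of the new 40016.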

Performance, simplicity in creating/maintaining/administering the pool/folders/files & also creating/maintaining permissions is the reason I went with a flat top level directory structure when I first created this pool some 11-12 years ago, and completely avoided multiple ZFS filesystems at the pool '/' (root).

/
├───Fs1
├───Fs2
└───Fs3

If I created multiple ZFS file systems as top level folders on the '/' (root) of a pool (named Fs1, Fs2, Fs3), I'd have to create multiple snapshot jobs to create/delete/maintain these, making the whole thing very complex. Yes, I am aware of '-r' option to recursively create descendant snapshots, but it'd still be a huge mess. Not to mention replication and UID/GID/SID/ACL permissions.

This is the way Sun intended ZFS to be used when they developed it, and it is the "best use case". Use autosnaps recursively, or create a job per filesystem if you want a different snap history per filesystem. For replications, create a job per filesystem.

With up to, say, a dozen filesystems I have or see no problem with this. It may be different with hundreds of users and the goal of using a filesystem per user, but this is a layout I would not prefer.
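
On the command line, the difference between one recursive snap and a snap per filesystem looks roughly like this. It is a sketch with made-up function and snapshot names; napp-it autosnap jobs use their own naming scheme, which is not shown here.

```shell
# Sketch only: function names and snapshot names are illustrative.

# One snapshot of the pool and all filesystems below it (same snap name everywhere).
snapshot_pool_recursive() {
    pool="$1"
    snapname="$2"
    zfs snapshot -r "$pool@$snapname"
}

# A snapshot of a single filesystem, for a separate snap history per filesystem.
snapshot_filesystem() {
    fs="$1"
    snapname="$2"
    zfs snapshot "$fs@$snapname"
}
```

For example, `snapshot_pool_recursive pool daily-1` covers Fs1, Fs2 and Fs3 in one call, while `snapshot_filesystem pool/Fs1 hourly-1` touches only Fs1.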


I think you misunderstood what I wanted to do; so let me try again.

Currently I have a pool that looks like this:
/
├───folder1
├───folder2
└───folder3

Currently when I share this pool I must provide a top level SMB share name, so the shared directory structure looks like this:
/           ( \\SERVER\SMB_SHARE_NAME\ )
├───folder1 ( \\SERVER\SMB_SHARE_NAME\folder1 )
├───folder2 ( \\SERVER\SMB_SHARE_NAME\folder2 )
└───folder3 ( \\SERVER\SMB_SHARE_NAME\folder3 )

What I want to know is if I can share the pool without a SMB share name, so that the directory structure would look something like this:
/           ( \\SERVER\ )
├───folder1 ( \\SERVER\folder1 )
├───folder2 ( \\SERVER\folder2 )
└───folder3 ( \\SERVER\folder3 )

As the pool itself is a ZFS filesystem this would be possible if you enable SMB on the root filesystem with only simple folders below. But as I said, this is not "best use case" and you possibly create more problems than this solves. For example you cannot replicate a pool to the top level of another pool and you cannot use or modify different ZFS properties per use case.

Create one or a few filesystems and share them.

pool/
├───Fs1=share 1
│   ├───folder 1
│   └───folder 2
└───Fs2=share 2
    ├───folder 1
    └───folder 2

This is indeed the ZFS layout you should use. Keep everything simple, use ZFS as intended.
From a client's view, when you connect to the server you will see share1 and share2.
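
A small sketch of how such a layout could be created from the shell, assuming a pool simply named `pool` as in the diagram. The function name is made up; in napp-it you would normally do the same from the ZFS Filesystems menu.

```shell
# Sketch only: the function name is illustrative.

# Create a ZFS filesystem and share it over SMB under the given share name.
create_shared_filesystem() {
    fs="$1"
    share="$2"
    zfs create "$fs"
    zfs set sharesmb="name=$share" "$fs"
}
```

Calling `create_shared_filesystem pool/Fs1 share1` and `create_shared_filesystem pool/Fs2 share2` would give the two shares; folder 1 and folder 2 are then created as regular folders inside each share, not as filesystems.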
 

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
Hi,

I am currently setting up 10G and I am struggling to get near 10G speeds to/from OmniOS AiO ESXi VM over SMB or NFS.

I apologise for such a long post!
tl;dr:
- File transfer speeds r/w only 250MB/s to all pools, as below in bold;
- iperf3 shows 10G between physical nics and 14G virtual;
- adding more vCPU increases speed
- server hardware a bit stretched as end of post, but should have sufficient bandwidth for each device
- included benchmarks at bottom of post

Edit: I'll keep adding/updating information in this post as I experiment, so please check post is current.

Workstation has X550-T2 (latest firmware) on Windows 10 pro 21H2
Server has X710-DA2 (latest firmware) on ESXi7-U2
OmniOS VM#1 (esxi hw v11) r151040l with napp-it 21.06 Home 2x vCPU 48GB memory (main AiO for VMs and Filer)
OmniOS VM#2 (esxi hw v13) r151038x with napp-it 21.06 Free 1x vCPU 3GB memory (for single VM only)
OmniOS nic VMXNET3 is MTU 1500, no jumboframes
ESXi vSwitch for VMs has one X710-DA2 port and is separate from management port on different 1G controller
sync is disabled on the shares being tested
set 'base tuning option' under System > Appliance Tuning
Nothing else is running on server during tests

I have tested using iperf3 and I am getting 10G speeds as expected between the following:
Workstation and OmniOS (10G each way)
Workstation and ESXi Win10 VM on OmniOS NFS share (10G each way)
Workstation and ESXi Win10 VM on local SSD datastore MX500 1TB (10G each way)
ESXi Win10 VM on local SSD to OmniOS (14G each way)
ESXi Win10 VM on OmniOS NFS share to OmniOS (14G each way)

File transfer (1x 9GB file, or 4x 1GB on Optane 905P) I get the following speeds:
from windows workstation to an ESXi Win10 VM stored on local SSD over SMB: I get about 350MB/s each way (spikes up to 450MB/s), which is close to single SSD speed
> from windows 10 workstation to an OmniOS share over SMB: I get a steady 250MB/s; using NFS windows client, I get up to 350MB/s.
This is to/from any of the following OmniOS#1 Pools, all with sync disabled (however, sync enabled produces similar results):
1 - 3x 2TB MP510 NVMe drives Z1 (27% full) no log drive
2 - 6x 8TB HDD raid10 (28% full) with log (20GB vmdk on Optane 900P)
3 - 8x 3TB HDD Z2 (42% full) with log (20GB vmdk on Optane 900P)
OmniOS#2 - 1x 60GB vmdk on Optane 900P

CrystalDiskMark test on:
> ESXi Win10 VM stored on OmniOS#1 NFS share (3x NVMe Z1): r/w of just over 100MB/s, which is quite slow
> ESXi Win10 VM stored on OmniOS#2 NFS share (local vmdk Optane 900p): r/w of just over 115MB/s, which is quite slow

ESXi Win10 VM stored on local SSD (MX500 1TB): r:480MB/s w:340MB/s which is close to single SSD speed

I noticed the number of vCPU seems to be making a big difference to SMB writes to all pools (similar with NFS win client):
1x vCPU = 120MB/s
2x vCPU = 250MB/s
4x vCPU = 385MB/s
8x vCPU = 575MB/s
This is odd, and I would prefer to use just 2x vCPU for the OmniOS storage VM
reads: 1x vCPU = 130MB/s and 2x up to 8x vCPU remain steady at 250MB/s

Edit:
With 2x vCPU: ESXi reports CPU load for OmniOS VM#1 is constantly about 35%, auto-jobs and smb server are disabled, I'm not sure what it is doing? OmniOS VM#2 1x vCPU is about 13%, other VMs near 0%.
I tried exporting Tank1 to do more tests on Tank2, but same results.
During a file read or write to Tank2:
- ESXi reports CPU load at 100%;
- napp-it reports 100% 100% for 'busy (iostat): now, last 10s;
- napp-it CPU Process: zpool-Tank2/232 is at 5.4% during write, 0% during read
- during write, Realtime monitor reports Tank2 wait/waitlast 10s at 100%, w%/b% at 70%, avrg = 100%; iostat: read/write is
- during read, Realtime monitor reports Tank2 wait/waitlast 10s at 0%, w%/b% at 0%, avrg = 100%
- ARC & ARC-avrg at around 100%
- vmxnet3 around 1920Mb/s
- iostat: read/write for rel_avr_dsk & worst_dsk are mostly 100% / 1000/s, but drop briefly to 200/s intermittently


I have tried 2 different switches, with same results:
D-Link DGS-1510-52
MikroTik CRS305-1G-4S+IN

Workstation (Win10 pro 21H2):
Asrock Z690M-ITX/ax, i9-12900K, 64GB DDR4, nvidia 3080Ti FE (PCIe x16), Optane 905p 480Gb (M.2 x4), X550-T2 (M.2 x4), MX500 2TB (SATA)

Server (ESXi 7.0 U2):
Supermicro X11SCA-F, E-2278G (8c/16t), 128Gb DDR4 ECC
PCIe x8: Nvidia Quadro RTX 4000 (not in use during tests)
PCIe x8: AOC-SHG3-4M2P (4x M.2 PLX) with:
- 3x M.2 x4 Corsair MP510 2TB NVMe - Local ESXi, each with full disk vmdk's for OmniOS (should get from 2-4GB/s each)
- 1x M.2-PCIe x4 X710-DA2 - Local ESXi (should get from 2-4GB/s, depending on NVMe use)
PCIe x4: Nvidia T1000 (not in use during tests)
M2-2 x4: Intel Optane 900P 280GB - Local ESXi, store OmniOS #1 & #2 and vmdk's for OmniOS logs
PCIe x1: Broadcom 9400-16i (14x HDD) - Passthrough to OmniOS (should get up to 1GB/s)
AHCI: 6x SSDs (ESXi Boot, Scratch, Datastores etc) - Local ESXi
All devices with latest firmware, except for mainboard BIOS, which is 1.2 as this allows concurrent use of 2x GPU, iGPU and iKVM (latest 1.6a doesn't). mainboard BIOS settings configured for GPU's only, perhaps there are other settings I need to tweak.

Benchmark OmniOS#1 Tank 1 (3x NVMe) 2x vCPU:

Benchmark: Write: filebench, Read: filebench, date: 01.22.2022

pool: Tank1
  NAME          STATE     READ WRITE CKSUM
  Tank1         ONLINE       0     0     0
    raidz1-0    ONLINE       0     0     0
      c1t3d0    ONLINE       0     0     0
      c1t4d0    ONLINE       0     0     0
      c1t5d0    ONLINE       0     0     0

host        SRV-NAPP
pool        Tank1 (recsize=128K, ssb=-, compr=off, readcache=none)
slog        -
encryption  -
remark

Fb3 randomwrite.f       sync=always       sync=disabled
                        10698 ops         12831 ops
                        2139.524 ops/s    2566.152 ops/s
                        5906us cpu/op     3970us cpu/op
                        0.5ms latency     0.3ms latency
                        16.6 MB/s         20.0 MB/s

Fb4 fivestreamwrite.f   sync=always       sync=disabled
                        7755 ops          9328 ops
                        1545.201 ops/s    1840.705 ops/s
                        12834us cpu/op    10866us cpu/op
                        3.2ms latency     2.6ms latency
                        1544.2 MB/s       1839.7 MB/s
________________________________________________________________________________________
                        randomread.f      randomrw.f        fivestreamrea
pri/sec cache=none      15.6 MB/s         39.6 MB/s         2.0 GB/s
________________________________________________________________________________________

Benchmark: Write: filebench, Read: filebench, date: 01.22.2022

host        SRV-NAPP
pool        Tank1 (recsize=128K, ssb=-, compr=off, readcache=all)
slog        -
encryption  -
remark

Fb3 randomwrite.f       sync=always       sync=disabled
                        24927 ops         52424 ops
                        4985.103 ops/s    10472.366 ops/s
                        2840us cpu/op     1539us cpu/op
                        0.2ms latency     0.1ms latency
                        38.8 MB/s         81.7 MB/s

Fb4 fivestreamwrite.f   sync=always       sync=disabled
                        5494 ops          9370 ops
                        1098.772 ops/s    1873.895 ops/s
                        15813us cpu/op    10634us cpu/op
                        4.5ms latency     2.6ms latency
                        1097.8 MB/s       1872.9 MB/s
________________________________________________________________________________________
                        randomread.f      randomrw.f        fivestreamrea
pri/sec cache=all       227.8 MB/s        151.8 MB/s        3.4 GB/s
________________________________________________________________________________________


Benchmark OmniOS#1 Tank 2 (6x HDD + 20GB Optane 900P) 2x vCPU:

Benchmark: Write: filebench, Read: filebench, date: 01.22.2022

pool: Tank2
  NAME                         STATE     READ WRITE CKSUM
  Tank2                        ONLINE       0     0     0
    mirror-0                   ONLINE       0     0     0
      c0t5000CCA27DC5D283d0    ONLINE       0     0     0
      c0t5000CCA27ED1CABCd0    ONLINE       0     0     0
    mirror-1                   ONLINE       0     0     0
      c0t5000CCA27DC60CB2d0    ONLINE       0     0     0
      c0t5000CCA27ED07C52d0    ONLINE       0     0     0
    mirror-3                   ONLINE       0     0     0
      c0t5000CCA0BEC07FD8d0    ONLINE       0     0     0
      c0t5000CCA0BEC1976Ed0    ONLINE       0     0     0
  logs
    c1t2d0                     ONLINE       0     0     0

host        SRV-NAPP
pool        Tank2 (recsize=128K, ssb=-, compr=off, readcache=none)
slog
encryption  -
remark

Fb3 randomwrite.f       sync=always       sync=disabled
                        400 ops           35 ops
                        79.997 ops/s      7.000 ops/s
                        44521us cpu/op    746280us cpu/op
                        12.5ms latency    104.2ms latency
                        0.6 MB/s          0.0 MB/s

Fb4 fivestreamwrite.f   sync=always       sync=disabled
                        2178 ops          3016 ops
                        435.540 ops/s     603.149 ops/s
                        28219us cpu/op    17613us cpu/op
                        11.4ms latency    8.2ms latency
                        434.5 MB/s        602.1 MB/s
________________________________________________________________________________________
                        randomread.f      randomrw.f        fivestreamrea
pri/sec cache=none      0.0 MB/s          0.0 MB/s          6.6 GB/s
________________________________________________________________________________________

Benchmark: Write: filebench, Read: filebench, date: 01.22.2022

host        SRV-NAPP
pool        Tank2 (recsize=128K, ssb=-, compr=off, readcache=all)
slog
encryption  -
remark

Fb3 randomwrite.f       sync=always       sync=disabled
                        13888 ops         27387 ops
                        2777.525 ops/s    5476.387 ops/s
                        2805us cpu/op     2108us cpu/op
                        0.4ms latency     0.2ms latency
                        21.6 MB/s         42.6 MB/s

Fb4 fivestreamwrite.f   sync=always       sync=disabled
                        2338 ops          3031 ops
                        467.478 ops/s     606.116 ops/s
                        23011us cpu/op    12118us cpu/op
                        10.6ms latency    7.9ms latency
                        466.5 MB/s        605.1 MB/s
________________________________________________________________________________________
                        randomread.f      randomrw.f        fivestreamrea
pri/sec cache=all       245.0 MB/s        425.6 MB/s        5.0 GB/s
________________________________________________________________________________________
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,068
In the end you can only increase efficiency, e.g. with jumbo frames or faster disks; reduce raid calculations, e.g. with mirrors, which also improve multistream reads; reduce disk access with RAM; or avoid extra load, e.g. due to encryption. If the CPU or disk load is at 100%, the system is as fast as possible, with CPU or disks being the limiting factor for better values. For a filer, avoid/disable sync (no Slog needed then), as sync will reduce write performance massively. Use sync only for VM storage or databases.

btw
Sequential sync write > 1000 MB/s with async > 1800 MB/s is excellent for ZFS software raid (do not forget that ZFS processes more data due to checksums, has a higher fragmentation and is more iops limited than older/other filesystems). ZFS is not for highest performance but for highest data security
 

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
Hi Gea,
I thought the pool benchmarks looked ok too, I compared them with a few of your previous benchmark documents. From what I can see in the benchmarks (maybe I'm reading them wrong), my pools should be able to get very close to 10Gb/s on reading and writing. iperf also reports 10Gb/s and much higher for the AiO VMs. So I don't know where the bottleneck is, resulting in only 250MB/s each way on SMB to physical clients and only 100MB/s to/from VMs.
I thought my CPU has quite good single core speed. Is there some other data I should be looking for to give me a clue?
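
To put those numbers side by side, here is a quick back-of-the-envelope conversion between line rate (Gbit/s) and copy speed (MB/s). It uses decimal units and ignores protocol overhead; the function names are made up for this example.

```shell
# Sketch only: integer arithmetic, decimal units, no protocol overhead.

# Line rate in Gbit/s -> theoretical maximum in MB/s (1 Gbit/s = 125 MB/s).
gbit_to_mbyte() {
    echo $(( $1 * 1000 / 8 ))
}

# Copy speed in MB/s -> equivalent rate in Mbit/s.
mbyte_to_mbit() {
    echo $(( $1 * 8 ))
}

# Observed MB/s as a whole-number percentage of a maximum MB/s.
percent_of() {
    echo $(( $1 * 100 / $2 ))
}
```

`gbit_to_mbyte 10` gives 1250, so 10G tops out at about 1250 MB/s. `mbyte_to_mbit 250` gives 2000, which is close to the vmxnet3 rate of around 1920Mb/s seen during the transfers, and `percent_of 250 1250` gives 20, so SMB is using roughly a fifth of the link. The 100MB/s seen inside the VMs is 800 Mbit/s, about what a 1G link would deliver.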
 

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
I have set 8x vCPU on OmniOS and I'm still struggling with slow VMs backed on NFS from OmniOS 3x NVMe Z1 (CrystalDiskMark of c: drive r/w:100MB/s)
During VM CDM benchmark, OmniOS CPU load is low, Pool use is low, wait low
iPerf3 is 23Gb/s which I would expect for vmxnet3 over software
The same VM backed on a single local SSD performs as expected (CrystalDiskMark of c: drive r/w:450MB/s)

iSCSI tests over physical network are very good (CrystalDiskMark MTU 9000: w:650MB/s & r:1050MB/s from all pools)
SMB 'reads' appear much slower than iSCSI (MTU 1500: w:700MB/s & r:300MB/s and MTU 9000: w:900MB/s & r:450MB/s)

The napp-it NFS and SMB properties are set to defaults.
I have run out of ideas of what to adjust in ESXi (latest VMtools, ethernet0.coalescingScheme = disabled )
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,068
Napp-it default tuning increases nfs/tcp/vmxnet3 buffers/servers.
Regarding ESXi you may try advanced settings of the napp-it VM and set latency to low.
 

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
Regarding ESXi you may try advanced settings of the napp-it VM and set latency to low.
Thanks Gea, I tried setting napp-it VM's latency to low, but it hasn't helped.

I have just tried CrystalDiskMark on a NFS backed VM in my secondary server and I am getting much better speeds (r:500MB/s & w:400MB/s, with sync enabled)
This secondary server was set up more recently, but on older hardware: Supermicro X9-SCL-F, E3-1275 V2, 32GB ECC, ESXi 7.0U2, OmniOS r15221038 (2x vCPU 8GB), M1015: 4x 250GB SSD MX500 RAID-0.
The VMs are replicated from my main server, while I experiment (as previous discussion earlier in this thread).
I have a spare X520-DA2 I plan to install on this, and I am curious to see how it performs.

So, I am confident the speeds are possible with NFS backed VMs. Perhaps the problem with my main server is the result of years of updates. I would prefer to fix it as it is and understand why the VMs are capped at 1Gb/s speeds, but maybe I should just export all pools, make a fresh install of OmniOS/napp-it and restore settings. Maybe reinstall ESXi too.
 

ARNiTECT

Weaksauce
Joined
Aug 4, 2012
Messages
65
I did a reinstall of the napp-it VM, which seemed to go smoothly, but unfortunately, the VM backed on NFS drive is still capped at 100MB/s.

reinstall process:
- In napp-it, I ran a backup job, stopped the services, removed the LUs, exported the pools
- I made a new VM using the current napp-it OVA, I set 2x vCPU, 48GB ram, added nics, HBA and existing vmdk drives.
- made a copy of VM (ESXi snapshots don't work with my full disk vmdk drives)
- booted VM, setup nics, updated to latest OmniOS r151040l and napp-it 22.01b, imported pools, user>restore settings/jobs, join domain, enter key, reboot VM and ESXi.
- boot up napp-it VM, NFS drives automatically shared
- boot Win 10 VM backed by NFS share, run CrystalDiskMark, reports 100MB/s
Interestingly, the interface is significantly quicker now; for example clicking on ZFS Filesystems and the list populating in less than 3sec rather than about 15sec before.

I did a quick benchmark of the Tank1 NVMe pool used for NFS and the results are fast as before.
 