ZFS pools on spinning disks deliver impressive sequential throughput but struggle with synchronous write operations. Every sync write must land on non-volatile storage before ZFS acknowledges the operation — and with HDDs, that means waiting for platter rotation. SLOG (Separate Intent Log) and Special VDEV are two ZFS features that specifically address this problem.
Understanding the ZFS Intent Log (ZIL)
Before discussing SLOG, we need to understand the ZIL. The ZFS Intent Log is a mechanism that secures synchronous write operations. During a sync write, the following happens:
- Data is written to the ZIL
- ZFS acknowledges the write to the application
- In the background, data is written to its final location in the pool during the next Transaction Group (TXG) commit
- After the TXG commit, the ZIL entry is released
The ZIL always exists — it is an integral part of every ZFS pool. By default, it resides on the pool disks themselves. The problem: when the ZIL sits on the same HDDs as the data, ZIL writes compete with regular I/O operations.
When Are Sync Writes Used?
Not every application uses sync writes. The key scenarios:
| Protocol/Application | Sync Writes? | Reason |
|---|---|---|
| NFS (sync=standard) | Yes | NFS standard requires sync |
| iSCSI | Depends | Sync only when the initiator requests it (FUA or cache flush); often forced with sync=always |
| SMB (Durable Handles) | Partially | Depends on client and configuration |
| Databases (PostgreSQL, MySQL) | Yes | fsync() for transaction safety |
| VMs on ZFS (zvol) | Yes | Guest OS expects sync confirmation |
| Local file operations | No (usually) | async by default |
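To see the difference from the application's side, the sketch below writes the same data once buffered and once with O_DSYNC. It assumes GNU dd (the `oflag=dsync` flag); `write_blocks` is a hypothetical helper name, not a ZFS tool.

```shell
# write_blocks MODE FILE: writes 64 blocks of 4 KiB to FILE (hypothetical helper).
#   async: writes are buffered and flushed by the kernel later
#   sync:  O_DSYNC, each write returns only after the data is on stable storage
write_blocks() {
    case "$1" in
        async) dd if=/dev/zero of="$2" bs=4k count=64 2>/dev/null ;;
        sync)  dd if=/dev/zero of="$2" bs=4k count=64 oflag=dsync 2>/dev/null ;;
        *)     echo "usage: write_blocks async|sync FILE" >&2; return 1 ;;
    esac
}
```

Running `time write_blocks sync /tank/testfile` and `time write_blocks async /tank/testfile` on an HDD pool should make the gap visible; the exact numbers depend on the hardware.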
SLOG: The Separate Intent Log
A SLOG is a dedicated device to which the ZIL is offloaded. Instead of landing on slow pool disks, the ZIL sits on a fast NVMe or Optane device.
What a SLOG Needs (and What It Does Not)
SLOG needs:
- Extremely high IOPS (4K random write)
- Very low latency (< 100 microseconds ideal)
- Power-Loss Protection (PLP) — absolutely critical
- Moderate capacity (8–32 GB usually sufficient)
SLOG does NOT need:
- High sequential throughput
- Large capacity (the ZIL holds data for only seconds)
- High endurance (TBW is rarely a concern)
Why So Little Capacity?
The ZIL holds data only until the next TXG commit (every 5 seconds by default). The commit writes the data from memory to its final location, after which the ZIL entry is freed; the ZIL itself is read back only during recovery after a crash or power loss. Even under heavy sync-write load, rarely more than a few gigabytes accumulate.
The rule of thumb:
SLOG capacity ≈ Maximum sync-write throughput × TXG timeout × 2
Example: 500 MB/s sync writes × 5 seconds × 2 (safety buffer) = 5 GB. A 16 GB SLOG is sufficient for the vast majority of workloads.
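The rule of thumb is easy to script for other throughput figures. This is a minimal sketch; `slog_size_mb` is a hypothetical helper, and the default safety factor of 2 is simply the one used in the example above.

```shell
# slog_size_mb THROUGHPUT_MB_S TXG_TIMEOUT_S [SAFETY_FACTOR]
# Rule of thumb: sync-write throughput x TXG timeout x safety factor.
slog_size_mb() {
    throughput=$1
    timeout=$2
    factor=${3:-2}          # safety buffer, defaults to 2
    echo $(( throughput * timeout * factor ))
}

slog_size_mb 500 5 2        # prints 5000 (MB, i.e. 5 GB)
```

On Linux, the TXG timeout currently in effect can be read from /sys/module/zfs/parameters/zfs_txg_timeout and passed as the second argument.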
Setting Up a SLOG
# Add SLOG device to pool
zpool add tank log /dev/nvme0n1
# SLOG as mirror for redundancy (recommended)
zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
# Check pool status
zpool status tank
Output:
pool: tank
state: ONLINE
config:
  NAME           STATE
  tank           ONLINE
    raidz2-0     ONLINE
      da0        ONLINE
      da1        ONLINE
      da2        ONLINE
      da3        ONLINE
      da4        ONLINE
      da5        ONLINE
  logs
    mirror-1     ONLINE
      nvme0n1    ONLINE
      nvme1n1    ONLINE
Measuring SLOG Performance
Measure the difference before and after adding the SLOG:
# Test sync-write performance (without SLOG); --group_reporting sums the 8 jobs
fio --name=sync-write --rw=randwrite --bs=4k --size=1G \
--numjobs=8 --sync=1 --direct=1 --group_reporting --filename=/tank/testfile
# Test again after SLOG installation
# Expected: 5–50x more IOPS for sync writes
Power-Loss Protection: Non-Negotiable
A SLOG without PLP is worse than no SLOG at all. The reason: ZFS acknowledges the sync write as soon as data lands in the ZIL (on the SLOG). If the SLOG device loses data during a power failure that has not yet been written to the pool, that data is irrecoverably lost — and ZFS has already told the application the write was safe.
Devices with PLP:
- Intel Optane (all models)
- Enterprise NVMe with PLP (e.g., Samsung PM9A3, Micron 7450)
- Enterprise SATA SSDs (e.g., Intel S4610, Samsung PM893)
Devices WITHOUT PLP (do not use as SLOG):
- Consumer NVMe (Samsung 990 Pro, WD Black SN850X)
- Consumer SATA SSDs (Samsung 870 EVO, Crucial MX500)
Special VDEV: Accelerating Metadata
The Special VDEV is a newer ZFS feature (since OpenZFS 0.8) that addresses a different bottleneck: metadata and small blocks. ZFS stores metadata (directory structures, file attributes, deduplication tables) on the pool disks by default. On large pools with millions of files, metadata lookups become the bottleneck.
What the Special VDEV Stores
A Special VDEV handles:
- Metadata (dnode blocks, directory contents, filesystem metadata)
- Small data blocks (configurable via the special_small_blocks property)
- Deduplication tables (DDT, when dedup is enabled)
Setting Up a Special VDEV
# Add Special VDEV (mirror recommended)
zpool add tank special mirror /dev/nvme2n1 /dev/nvme3n1
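# Set threshold for small blocks (e.g., 64 KB); keep it below the dataset's
# recordsize (default 128 KB), or every data block lands on the Special VDEV
zfs set special_small_blocks=64k tank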
# Check pool status
zpool status tank
Output:
pool: tank
state: ONLINE
config:
  NAME           STATE
  tank           ONLINE
    raidz2-0     ONLINE
      da0 ... da5 ONLINE
  logs
    mirror-1     ONLINE
      nvme0n1    ONLINE
      nvme1n1    ONLINE
  special
    mirror-2     ONLINE
      nvme2n1    ONLINE
      nvme3n1    ONLINE
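The small-block threshold interacts with the dataset's recordsize: blocks smaller than or equal to special_small_blocks are allocated on the Special VDEV, so a threshold at or above the recordsize sends all data there. The sketch below makes that check explicit. `check_small_blocks` is a hypothetical helper, and both arguments are assumed to be given in KiB.

```shell
# check_small_blocks THRESHOLD_KIB RECORDSIZE_KIB (hypothetical helper)
# Reports which blocks a given special_small_blocks value would redirect.
check_small_blocks() {
    threshold=$1
    recordsize=$2
    if [ "$threshold" -eq 0 ]; then
        echo "metadata only"
    elif [ "$threshold" -ge "$recordsize" ]; then
        echo "all data blocks go to the special vdev"
    else
        echo "blocks up to ${threshold}K go to the special vdev"
    fi
}

check_small_blocks 128 128   # threshold equals the default recordsize
check_small_blocks 32 128
```

The current values for a dataset can be read with `zfs get special_small_blocks,recordsize tank`.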
Sizing the Special VDEV
The Special VDEV requires significantly more capacity than a SLOG since metadata is stored permanently:
Special VDEV capacity ≈ Pool capacity × metadata ratio + small blocks
Rules of thumb:
- Without small blocks: 1–5% of pool capacity for pure metadata
- With special_small_blocks set: 5–20% of pool capacity (depending on workload and threshold)
For a 100 TB pool:
- Pure metadata: 1–5 TB NVMe
- With small blocks: 5–20 TB NVMe
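The same percentages can be applied to any pool size. A minimal sketch, with `special_vdev_tb` as a hypothetical helper that takes the pool capacity in TB and a percentage from the rules of thumb above:

```shell
# special_vdev_tb POOL_TB PERCENT (hypothetical helper)
# Returns the Special VDEV capacity in TB for a given share of the pool.
special_vdev_tb() {
    awk -v pool="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", pool * pct / 100 }'
}

special_vdev_tb 100 1    # lower bound, pure metadata
special_vdev_tb 100 20   # upper bound, with small blocks
```

These are estimates only; the actual metadata ratio depends on file count, block sizes, and whether dedup is enabled.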
When Is a Special VDEV Worth It?
| Scenario | Benefit |
|---|---|
| Millions of small files (email, home shares) | High — metadata lookups become dramatically faster |
| Few large files (video, backup images) | Low — minimal metadata load |
| Dedup enabled | High — DDT on NVMe is orders of magnitude faster |
| Many snapshots | Medium — snapshot metadata benefits |
Hardware Selection: Intel Optane as the Gold Standard
Intel Optane is based on 3D XPoint technology and offers characteristics ideal for ZFS log devices:
| Property | Optane P5800X | Enterprise NVMe | Consumer NVMe |
|---|---|---|---|
| 4K Random Write Latency | 6 microseconds | 15–30 microseconds | 50–200 microseconds |
| 4K Random Write IOPS | 1.5M | 200–500K | 50–100K |
| Power-Loss Protection | Yes | Yes (Enterprise) | No |
| Endurance (DWPD) | 100 | 1–3 | 0.3–1 |
| Price (2026) | High (end of life) | Medium | Low |
Optane availability: Intel has discontinued Optane production. Remaining stock is still available, but prices are rising. Alternatives like Samsung PM9A3 or Kioxia FL6 offer good performance with PLP but cannot match Optane’s latency figures.
Optane Models for SLOG and Special VDEV
- Optane P1600X (118 GB): Ideal as SLOG — compact, affordable, PLP
- Optane P5800X (400/800 GB): Ideal as Special VDEV — high capacity, extreme performance
- Optane 905P/900P: Older generation, but still excellent for SLOG
Risks When SLOG or Special VDEV Fails
SLOG Failure (Non-Redundant)
If a single SLOG without mirror fails:
- Pool stays online — ZFS automatically falls back to the pool-internal ZIL
- Performance drops to the level without SLOG
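- No data loss, unless the system crashes or loses power at the same moment: the ZIL is only read during crash recovery, so sync writes that existed solely on the failed SLOG (and in RAM) would then be lost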
Partial SLOG Mirror Failure
With a SLOG mirror where one device has failed:
- Pool stays online with full performance
- Redundancy is gone — replace the failed device promptly
Special VDEV Failure
A failed Special VDEV is critical:
- The entire pool becomes unreadable if the Special VDEV is lost
- A Special VDEV must be configured as a mirror
- Consider a 3-way mirror for critical data
# Special VDEV as 3-way mirror (highest safety)
zpool add tank special mirror /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
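Whether the log and special devices are actually mirrored can be checked from the `zpool status` output. The following is a rough sketch, not a complete parser: `check_aux_redundancy` is a hypothetical helper that only inspects the first vdev listed under each of the two sections.

```shell
# check_aux_redundancy (hypothetical helper): reads `zpool status` output on
# stdin and reports whether the first vdev listed under "logs" and "special"
# is a mirror.
check_aux_redundancy() {
    awk '
        NF == 0 { next }                                  # skip blank lines
        $1 == "logs" || $1 == "special" { section = $1; next }
        section != "" {
            if ($1 ~ /^mirror-/) print section ": mirrored"
            else                 print section ": NOT mirrored (" $1 ")"
            section = ""                                  # only the first vdev is inspected
        }
    '
}

# Sample input modeled on the status output shown earlier
check_aux_redundancy <<'EOF'
logs
  mirror-1   ONLINE
    nvme0n1  ONLINE
    nvme1n1  ONLINE
special
  mirror-2   ONLINE
    nvme2n1  ONLINE
    nvme3n1  ONLINE
EOF
```

On a live system the input would come from `zpool status tank | check_aux_redundancy`.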
Monitoring
Monitor SLOG and Special VDEV regularly:
# SLOG and Special VDEV status
zpool status tank
# I/O statistics for log devices
zpool iostat -v tank 5
# Check SMART values of NVMe devices
smartctl -a /dev/nvme0n1
Pay special attention to:
- Wear Level (Percentage Used) — usually uncritical for enterprise NVMe
- Available Spare — should be above 10%
- Media Errors — should remain at 0
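These three values can be pulled out of the smartctl output automatically. The sketch below assumes the field labels that smartctl prints for NVMe devices ("Available Spare", "Percentage Used", "Media and Data Integrity Errors"); `nvme_health` is a hypothetical helper, and the 10% spare limit is the one named above.

```shell
# nvme_health (hypothetical helper): reads `smartctl -a` output for an NVMe
# device on stdin and flags low spare capacity or media errors.
nvme_health() {
    awk -F: '
        /^Available Spare:/                 { gsub(/[ %]/, "", $2); spare = $2 }
        /^Percentage Used:/                 { gsub(/[ %]/, "", $2); used = $2 }
        /^Media and Data Integrity Errors:/ { gsub(/[ ,]/, "", $2); errs = $2 }
        END {
            status = "OK"
            if (spare == "")          { print "WARN: no available-spare field found"; status = "WARN" }
            else if (spare + 0 <= 10) { print "WARN: available spare " spare "%"; status = "WARN" }
            if (errs + 0 > 0)         { print "WARN: media errors " errs; status = "WARN" }
            print "percentage used: " used "%"
            print "status: " status
        }
    '
}
```

A typical call would be `smartctl -a /dev/nvme0n1 | nvme_health`, run from cron or a monitoring agent.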
Conclusion
SLOG and Special VDEV are not universal solutions but targeted optimizations for specific bottlenecks. A SLOG is worthwhile with high sync-write workloads (NFS, iSCSI, databases), while a Special VDEV benefits metadata-intensive workloads (many small files, dedup). The right hardware choice — especially Power-Loss Protection — is not optional but essential for data integrity.