
Table of Contents
Open Table of Contents
Introduction
In this survey, I compare seven widely used pseudo-random unique-ID generators—UUID (v4 & v7), ULID, NanoID, MongoDB ObjectId, KSUID, TSID, and Twitter’s Snowflake—along key dimensions: bit-width, text representation, encoding, embedded timestamp (and sortability), randomness, collision resistance, human-readability, and potential pitfalls (check digits, “magic”/version bits, ambiguous characters). A consolidated comparison table is provided up front, followed by in-depth overviews of each scheme, discussion of strengths and weaknesses, and recommendations for scenarios ranging from high-throughput distributed systems to compact web tokens.
Comparison Table
Compiled from official specifications and community benchmarks (Wikipedia, GitHub, GitHub, MongoDB, GitHub, Foxhound Systems, Wikipedia)
| Generator | Bit-Width | String Length | Encoding | Timestamp Bits | Random Bits | Sortable? | Check Digit | Character Set | Ambiguous Characters |
|---|---|---|---|---|---|---|---|---|---|
| UUID v4 | 128 | 36 (8-4-4-4-12) | Hexadecimal + hyphens | 0 (v4 is purely random) | 122 (6 version+variant bits) (Wikipedia) | No | No | 0–9, a–f, ‘–’ | 0↔O, 1↔I/l |
| UUID v7 | 128 | 36 | Hex + hyphens | 48 (Unix ms epoch) (npm) | 74 (version+counter+random) (npm) | Yes (lexicographic) | No | 0–9, a–f, ‘–’ | 0↔O, 1↔I/l |
| ULID | 128 | 26 | Crockford Base32 | 48 (Unix ms epoch) (GitHub) | 80 (GitHub) | Yes (lexicographic) | No | A–Z, 0–9 (no I, L, O, U) (GitHub) | Minimal (designed safe) |
| NanoID | Variable | ~21 (default) | URL-safe Base64 variant | 0 (fully random) | ~168 bits of randomness (21 × 6 bits) (GitHub) | No | No | A–Z, a–z, 0–9, ‘_’, ‘-’ (GitHub) | ‘_’↔‘-’, O↔0, l↔1 |
| ObjectId | 96 | 24 (hex) | Hexadecimal | 32 (seconds since Unix epoch) (MongoDB) | 40 (5-byte random value) + 24 (counter) (MongoDB) | Roughly (per-second order) | No | 0–9, a–f | Minimal (hex only) |
| KSUID | 160 | 27 | Base62 | 32 (big-endian UTC seconds since 2014-05-13) (GitHub) | 128 (GitHub) | Yes (lexicographic) | No | 0–9, A–Z, a–z | 0↔O, 1↔I/l |
| TSID | 64 | 13 | Crockford Base32 | 42 (ms-precision since custom epoch) (Foxhound Systems) | 22 (random/counter/node mix) (Foxhound Systems) | Yes (numerical order) | No | A–Z, 0–9 (no I, L, O, U) (Foxhound Systems) | Minimal (safe set) |
| Snowflake | 64 | ~18–19 digits | Decimal | 41 (ms since custom epoch) (Wikipedia) | 12 (sequence) + 10 (machine) (Wikipedia) | Yes (numerical order) | No | 0–9 | None (digits only) |
1. UUID (Universally Unique Identifier)
UUIDs are 128-bit identifiers standardized by RFC 4122.
- UUID v4 is purely random (122 bits of entropy) with 6 fixed version/variant bits; average collision risk is negligible (2^122 possible values) (Wikipedia).
- UUID v7 embeds a 48-bit Unix-millisecond timestamp followed by random bits, restoring sortability while retaining randomness (npm).
- Strengths: Universally supported across databases, languages, and platforms; no coordination needed (Wikipedia).
- Weaknesses: v4 is not ordered; hyphenated hex is bulky; index fragmentation in databases; version bits act as “magic” markers but can break lexicographic order (Wikipedia).
- Use Cases: v4 for general-purpose IDs where order doesn’t matter; v7 for time-series keys in high-load databases.
2. ULID (Universally Unique Lexicographically Sortable Identifier)
ULID is a 128-bit, Base32-encoded ID combining a 48-bit millisecond timestamp and 80 bits of randomness (GitHub).
- Strengths: Lexicographically sortable, URL-safe, case-insensitive, no ambiguous characters (omits I, L, O, U) (GitHub).
- Weaknesses: Randomness within the same millisecond is not sequenced; 26-character string still longer than decimal Snowflake IDs (GitHub).
- Use Cases: Distributed databases needing sortable keys; log-aggregation; offline-first systems.
3. NanoID
NanoID generates compact, cryptographically secure, URL-friendly IDs in JavaScript (and >20 languages), defaulting to 21 characters (≈168 bits) from an alphabet of 64 symbols (A–Z, a–z, 0–9, ‘_’, ‘–’) (GitHub).
- Strengths: Extremely small bundle (< 120 B gzip), customizable length and alphabet, no external dependencies (GitHub).
- Weaknesses: Not time-sortable; default alphabet may include confusing symbols (‘_’ vs ‘–’, O↔0, l↔1) (GitHub).
- Use Cases: Front-end session tokens; short links; non-sequential random IDs.
4. MongoDB ObjectId
ObjectId is a 12-byte BSON type: 4 bytes timestamp (seconds), 5 bytes random value, 3 bytes counter (MongoDB).
- Strengths: Rough insertion order; compact 24-hex string; built into MongoDB with getTimestamp() support (MongoDB).
- Weaknesses: Only second-level resolution; client clocks may differ; not strictly monotonic; no textual check digit (MongoDB).
- Use Cases: Default MongoDB document IDs; document versioning.
5. KSUID (K-Sortable Unique IDentifier)
KSUID is a 20-byte (160 bit) ID: a 32-bit big-endian “KSUID epoch” timestamp + 128 bits of randomness, encoded as 27 Base62 characters (GitHub).
- Strengths: Sortable by creation time; over 100 years of timestamp range; high collision resistance (2^128) (GitHub).
- Weaknesses: Base62 includes all letters and digits, leading to potential 0/O and 1/I/l confusion (GitHub).
- Use Cases: Segment’s analytics pipeline; clustered‐index keys in SQL databases.
6. TSID (Time-Sorted Unique IDentifier)
TSID is a 64-bit integer: leading 42 bits for ms-precision timestamp + 22 bits of random/node/counter, optionally including a node ID, rendered as 13 Crockford Base32 characters (Foxhound Systems).
- Strengths: Drop-in replacement for 64-bit auto-increment; efficient indexing; human-readable 13-char URLs; strong collision resistance (Foxhound Systems).
- Weaknesses: Millisecond granularity can reorder within the same ms; configuration complexity for multi-node setups; requires user-defined generator (Foxhound Systems).
- Use Cases: SQL primary keys where ordering and compactness matter; microservices needing consistent 64-bit IDs.
7. Snowflake (Twitter)
Snowflake is a 64-bit integer: 41 bittimestamp (ms since custom epoch) + 10 bits machine ID + 12 bits per-ms sequence, serialized as a decimal string (Wikipedia).
- Strengths: Numerical IDs (no alphabet confusion); monotonic per-machine; extractable timestamp; widely adopted by Twitter, Discord, Instagram (Wikipedia).
- Weaknesses: Epoch and bit allocations are fixed; requires coordination for machine IDs; long decimal (~18 digits) can be cumbersome in URLs (Wikipedia).
- Use Cases: Social‐media post IDs; high-scale event logs; distributed systems requiring per-node unique sequences.
Recommendations
- Compact, random web tokens: NanoID
- Universal library support & large address space: UUID v4 (random) or UUID v7 (sorted)
- Ordered distributed IDs: ULID (128 bits) or KSUID (160 bits)
- Space-efficient SQL primary keys: TSID (64 bits)
- MongoDB specific: ObjectId
- High-throughput event streaming: Snowflake (per-node sequences)
- When timezone auditing is critical: Snowflake or ULID/UUID v7 (explicit timestamp)
- Avoid confusing characters: choose ULID or TSID (omit I, L, O, U)
Each generator balances trade-offs in size, sortability, entropy, and human-factors. Your ideal choice depends on whether you prioritize pure randomness, timestamp ordering, storage efficiency, or ease of transcription.