uid

uid is a Scala library for the generation and handling of 64-bit unique ids. It is inspired by Twitter Snowflake and it aims to be more flexible.

Motivation

Modern web applications are polyglot and they are deployed on multiserver environments. Ids are used on DBMS, including NoSQL systems, server-side applications and web browsers.

UUIDs are safe and quick but they are not handy on databases and on browsers. Also, UUIDs are usually not ordered. Twitter Snowflake is an excellent solution but it is designed to fulfill specific requirements and to fit to Twitter’s architecture. For instance, it includes a Thrift Server, it is configured by Zookeper and its ids are consstructed according to a defined format.

In consideration of the above premises, a 64-bit id solution seems convenient for databases and server-side applications. Web browsers, on the contrary, as JavaScript represents numbers according to the IEEE 754 standard, don’t handle well 64-bit integers. For browsers, a decent string representation of the ids could be a useful.

To conclude, the targets if this library are summarized below:

Setup

uid is published on the Sonatype and the Maven Central repositories. uid 1.1 is built for Scala 2.10. The difference from 1.0 is the use of Value Classes.

Documentation

uid use is very simple. You define the structure of the ids via a Scheme and then you generate ids via a Generator. You can choose to generate ids as Longs or use the Id type of the library.

import gr.jkl.uid.{ Scheme, Generator }

// Define the Id specification with the following parameters:
// timestamp: 42 bits
// node     : 12 bits
// sequence : 10 bits
// epoch    : 1351728000000L (01 Nov 2012 00:00:00 GMT)
implicit val scheme = Scheme(42, 12, 10, 1351728000000L)

// Construct an Id Generator for a machine with id 0
val generator = Generator(0L)

// Create a new Id
val id = generator.newId

// Create a new Id as a Long
val longId = generator.newLong

Specifying Id Structure

An id occupies 64-bits and is composed of:

The number of bits devoted to each of the parameters and the epoch is defined by a Scheme. Various parts of the library require a scheme and it is convenient to define one implicitly.

import gr.jkl.uid.Scheme

val scheme = Scheme(
  timestampBits = 44, 
  nodeBits      = 12, 
  sequenceBits  = 8,
  epoch         = 1351728000000L)

scheme.maxTimestamp 
// res0: Long = 18943914044415

scheme.maxNode 
// res1: Long = 4095

scheme.maxSequence 
// res2: Long = 255

Deconstructing Ids

Id is a Value Class with an underlying Long and it contains methods which extract its parameters. Id’s companion object contains factory methods, extractors and Orderings.

import gr.jkl.uid.{ Id, Scheme }
val id = Id(-9217076510208286673L)
implicit val scheme = Scheme(44, 12, 8, 1351728000000L)

id.timestamp
// res0: Long = 1357731882071

id.node
// res1: Long = 32

id.sequence
// res2: Long = 47

id.underlying
// res3: Long = -9217076510208286673

Id.create(1357731882071L, 32, 47)
// res4: Option[gr.jkl.uid.Id] = Some(--LMQy4R1-j)

Timestamp, node and sequence, among other methods, depend on the id’s underlying Long and on the scheme. A different scheme would produce different results.

import gr.jkl.uid.{ Id, Scheme }

val id = Id(-9217076510208286673L)

implicit val scheme = Scheme(43, 16, 5, 1357700000000L)

id.timestamp
// res0: Long = 1360701941035

id.node
// res1: Long = 33025

id.sequence
// res2: Long = 15

Representing Ids as Strings

Ids are represented as strings with a Base64 like URL-safe encoding. The encoding is based on the following index table.

ValueChar  ValueChar  ValueChar  ValueChar
0- 16F 32V 48k
1017G33W49l
2118H34X50m
3219I35Y51n
4320J36Z52o
5421K37_53p
6522L38a54q
7623M39b55r
8724N40c56s
9825O41d57t
10926P42e58u
11A27Q43f59v
12B28R44g60w
13C29S45h61x
14D30T46i62y
15E31U47j63z

When ids are converted to Strings they are treated as unsigned values. Apart from toString, you can use toShortString which creates shorter strings omitting zeros from the beginning. Id's companion object contains a string extractor which can be used for pattern matching.

import gr.jkl.uid.Id

val id = Id(-9217076510208286673L)

val stringId = id.toString
// stringId: String = --LMQy4R1-j

val shortStringId = id.toShortString
// shortStringId: String = LMQy4R1-j

stringId match {
  case Id(a) => println(s"Valid id: $a")
  case _      => println("Invalid id")
}
// Valid id: --LMQy4R1-j

shortStringId match {
  case Id(a) => println(s"Valid id: $a")
  case _      => println("Invalid id")
}
// Valid id: --LMQy4R1-j

Warning: Short string decoding is broken on versions 1.0 and 1.1.

Sorting

Ids are roughly sortable. Like Twitter Snowflake they are k-sorted. The default Ordering, which is defined implicitly, sorts ids first by the timestamp, then by the node and then by the sequence. Sorting ids as Strings or Longs will produce the same result.

import gr.jkl.uid.Id

val ids = List.fill(1000)(Id(util.Random.nextLong))

val sortedIds = ids.sorted

val longIds = ids.map(_.underlying)

val sortedLongIds = longIds.sorted

val stringIds = ids.map(_.toString)

val sortedStringIds = stringIds.sorted

sortedLongIds == sortedIds.map(_.underlying)
// res0: Boolean = true

sortedStringIds == sortedIds.map(_.toString)
// res1: Boolean = true

It’s easy to implement an ordering for a class which contains an id.

import gr.jkl.uid.Id
import scala.math.Ordering

case class Comment(id: Id, commenter: String, body: String)

object Comment {
  implicit val IdOrdering: Ordering[Comment] = Ordering by (_.id)  
}

Alternatively, you can sort ids first by the timestamp, then by the sequence and then by the node. This ordering requires a scheme.

import gr.jkl.uid.{ Id, Scheme }

val ids = List.fill(1000)(Id(util.Random.nextLong))

implicit val scheme = Scheme(44, 12, 8, 1351728000000L)

val sortedIds = ids.sorted(Id.TimeSequenceNodeOrdering)

Generating Ids

A thread-safe non-blocking Generator is included in the library which relies on the System clock to generates ids. You construct a generator having a defined scheme and providing a node id. A generator produces unique Ids or Longs.

import gr.jkl.uid.{ Scheme, Generator }

implicit val scheme = Scheme(44, 12, 8, 1351728000000L)

val generator = Generator(714)

generator.newId
// res0: gr.jkl.uid.Id = --LRP8nVgc-

generator.newLong
// res1: Long = -9217054643603715584

The generator produce ids according to the following policies:

The above policies are implemented by the following logic:

  1. If current timestamp is greater than the timestamp of the last id, the new id will be constructed by the current timestamp and a sequence equal to 0.
  2. If current timestamp is less than or equal to the timestamp of the last id and the sequence of the last id is less than the maximum scheme’s sequence, the new id will be constructed by the previous timestamp and the previous sequence incremented by 1.
  3. If current timestamp is less than or equal to the timestamp of the last id and the sequence of the last id is equal to the maximum scheme’s sequence, the new id will be constructed by the previous timestamp incremented by 1 and a sequence equal to 0.

Synchronizing System Clock

Id Generator depends strongly on the system’s clock. Although the generator will continue to produce unique ids when clock goes back, it’s better to synchronize your server’s time with NTP in a mode which doesn't move the clock backwards.

If your can’t control time synchronization and your clock may move backwards it is suggested to instantiate your id generators providing the last id generated for each node.

// Constructs an Id generator for node 0 which will 
// generate ids that come after the lastId
val generator = Generator(0L, lastId)

API

Stable Releases

Development Release

Source Code

uid is hosted at https://github.com/nevang/uid. Get the source code by cloning the repository:

git clone https://github.com/nevang/uid.git

License

uid is released under the Simplified BSD license.

Copyright (c) 2013, Nikolas Evangelopoulos All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Fork me on GitHub