Fingerprint Filter
A filter that computes a (probably) unique identifier for each event and appends it as part of the data passed forward.
Dealing with duplicated records is a common problem in data pipelines. A common workaround is to use identifiers based on the hash of the data for each record. In this way, a duplicated record would yield the same hash, allowing the storage engine to discard the extra instance.
The fingerprint filter uses the non-cryptographic hash algorithm murmur3
to compute an id for each Oura event with a very low collision level. We use a non-cryptographic hash because they are faster to compute and non of the cryptographic properties are required for this use-case.
When enabled, this filter will set the fingerprint
property of the Event
data structure passed through each stage of the pipeline. Sinks at the end of the process might leverage this value as primary key of the corresponding storage mechanism.
Configuration
Adding the following section to the daemon config file will enable the filter as part of the pipeline:
[[filters]]
type = "Fingerprint"
Section: filter
type
: the literal valueFingerprint
.