The Deduplicate operator identifies entries with identical properties, so that repeated inputs can be handled separately from unique ones.
The operator can help deduplicate input such as:
Trigger content
Event bursts
Array content
Object data
Repeated IOC Matches for Known Threat
Repeated Failed Login Attempts
Phishing Email Reports
Cross Product Aggregation
and more!
The Deduplicate operator looks back through the workflow's executions within the designated time range for duplicated events.
Any input that does not match the input defined in the Deduplicate operator passes through the Unique branch.
For example, if you define the input as x and the step then runs through an array such as [x, y, x, z, y], only the first x event is defined as unique. In this instance, the array passed through the Unique branch is [x, y, z, y]. The value at array index 2 (the second x) is sent through the Duplicate branch.
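The routing in this example can be sketched in a few lines of Python. This only illustrates the behavior described above and is not how the operator is implemented; the function name split_by_input and its parameters are made up for the sketch.

```python
def split_by_input(items, defined_input, allowed=1):
    """Route items into unique and duplicate lists, mimicking the two branches."""
    unique, duplicate = [], []
    seen = 0  # how many times the defined input has been let through so far
    for item in items:
        if item != defined_input:
            unique.append(item)      # not the defined input: always Unique
        elif seen < allowed:
            seen += 1
            unique.append(item)      # still within the allowed count
        else:
            duplicate.append(item)   # repeated occurrence of the defined input
    return unique, duplicate


unique, duplicate = split_by_input(["x", "y", "x", "z", "y"], "x")
print(unique)     # ['x', 'y', 'z', 'y']
print(duplicate)  # ['x']
```

Note that the second y stays on the Unique branch, because only x was defined as the input.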
How it Works
Drag the Deduplicate operator onto the canvas.
Input: Specify the input expression that you want to evaluate for uniqueness.
The input can be anything:
an entire object or a specific field within an object
an entire array or a specific value within an array
a value (string, integer, boolean, etc.)
etc.
You can define multiple expressions in your Input by clicking Add Expression. Note that these are combined with AND, which means that all of the values must be present in the execution for it to be declared "Unique."
Time range: Define how far backward you want the operator to check for matching values. The maximum evaluation period is 31 days. After the designated period has passed, the uniqueness count will reset.
Number of executions: Define how many instances of the input can be treated as Unique across all of the workflow's executions within the given time range. The default is 1, and the maximum is 1000 instances.
Branches: Steps placed on the Unique branch are executed on the input that is passed through as Unique, which includes not only the defined input but also any other value in the given object. Steps placed on the Duplicate branch are executed on input defined as Duplicate.
For example:
You specify the IP address 192.168.1.1 in the input parameter, with a time range of 2 days and a number of executions of 3.
On the first day, the workflow receives two events with the IP address 192.168.1.1, both of which are allowed to pass through to the Unique branch.
On the second day, only the first event with the IP address 192.168.1.1 is passed through the Unique branch; each following event with 192.168.1.1 is passed through the Duplicate branch.
However, on the third day, the count restarts and allows 192.168.1.1 through three times over the next 2 days.
Every IP address that arrives that is not 192.168.1.1 is passed through the Unique branch.
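The timeline above can be simulated with a small Python model. This is a sketch under assumptions, not the operator's actual logic: the class name DedupWindow is hypothetical, and it assumes the count resets a fixed time range after the first matching event, which is what the example suggests.

```python
from datetime import datetime, timedelta


class DedupWindow:
    """Toy model of the example: a counter that resets once the time range has passed."""

    def __init__(self, defined_input, time_range, max_unique=1):
        self.defined_input = defined_input
        self.time_range = time_range
        self.max_unique = max_unique
        self.window_start = None  # when the current counting period began
        self.count = 0            # matches let through in the current period

    def route(self, value, now):
        if value != self.defined_input:
            return "unique"  # per the text, other values always pass as Unique
        if self.window_start is None or now - self.window_start >= self.time_range:
            self.window_start, self.count = now, 0  # assumed reset rule
        if self.count < self.max_unique:
            self.count += 1
            return "unique"
        return "duplicate"


dedup = DedupWindow("192.168.1.1", timedelta(days=2), max_unique=3)
events = [
    (datetime(2024, 1, 1, 9), "192.168.1.1"),   # day 1
    (datetime(2024, 1, 1, 15), "192.168.1.1"),  # day 1
    (datetime(2024, 1, 2, 10), "192.168.1.1"),  # day 2, third allowed instance
    (datetime(2024, 1, 2, 12), "192.168.1.1"),  # day 2, over the limit
    (datetime(2024, 1, 2, 13), "10.0.0.5"),     # a different IP address
    (datetime(2024, 1, 3, 9), "192.168.1.1"),   # day 3, count has reset
]
results = [dedup.route(ip, ts) for ts, ip in events]
print(results)
# ['unique', 'unique', 'unique', 'duplicate', 'unique', 'unique']
```

The dates are arbitrary; only the spacing between events matters for the outcome.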
When to Use The Deduplicate Operator
Below are some examples of use cases for the Deduplicate operator. Many other use cases are possible, as the operator is versatile and can be used within Loops, Switch, etc.
Duplicated Trigger Events:
Duplicated trigger events are events with identical properties (e.g. "user_id": "123") that unnecessarily execute the same workflow multiple times within a specific time range. Deduplicating them aids in alert consolidation.
Duplicated Case Creation:
A firewall may detect malicious traffic and open a case, and then malicious traffic from the same src_ip address may be detected shortly thereafter. Instead of opening an identical case, use the Deduplicate operator to aggregate the repeated detections into the same case.
Duplicated Asset Inventory:
When pulling asset data from multiple data sources, such as CMDBs, network scans, and endpoint agents, duplicate entries can frequently occur for the same asset (e.g., a device listed with slightly different names or IPs). Using the deduplication operator, you can consolidate these duplicate entries into a single, accurate record in the asset inventory.
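Each of these use cases comes down to choosing which fields identify an event. As a rough Python illustration of combining several expressions with AND, the sketch below builds a composite key from two fields. The helper dedup_key and the field names and values are hypothetical, and the sketch assumes that each distinct combination of values is tracked separately.

```python
def dedup_key(event, fields):
    """Build a composite key from the chosen fields (hypothetical helper)."""
    return tuple(event.get(field) for field in fields)


events = [
    {"src_ip": "10.0.0.5", "user_id": "123", "msg": "blocked"},
    {"src_ip": "10.0.0.5", "user_id": "123", "msg": "blocked again"},
    {"src_ip": "10.0.0.5", "user_id": "456", "msg": "blocked"},
]

seen = set()
unique, duplicate = [], []
for event in events:
    key = dedup_key(event, ["src_ip", "user_id"])
    if key in seen:
        duplicate.append(event)  # same src_ip AND same user_id seen before
    else:
        seen.add(key)
        unique.append(event)

print(len(unique), len(duplicate))  # 2 1
```

The third event shares the src_ip but not the user_id, so its key is new and it is treated as unique.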
Keep in Mind
Up to 50 events can be defined as unique. To increase this amount, contact your Torq support representative.
The path limit for the input parameter is 5MB.