ID Unification Output
Database generated by ID Unification
When ID Unification runs, it creates a database named according to the convention cdp_unification_${unification_name}. For example, if the unification name specified at the top of the Unification YML file is production, a database named cdp_unification_production is created in the account, and the output tables are stored in it.
name: production # used as unification database name
keys:
  - name: td_client_id
    valid_regexp: "[0-9a-fA-F]{8}-..."
    invalid_texts: ['']
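The naming convention above can be expressed as a one-line helper. This is only an illustrative sketch: the function name `unification_database_name` is hypothetical and not part of any product API.

```python
def unification_database_name(unification_name: str) -> str:
    """Return the output database name for a given unification name.

    Follows the documented convention cdp_unification_${unification_name}.
    The helper itself is hypothetical and exists only for illustration.
    """
    return f"cdp_unification_{unification_name}"


print(unification_database_name("production"))  # cdp_unification_production
```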
List of Generated Tables
Table Name | Example Table Name | Naming Rule | Write Type | Description |
---|---|---|---|---|
enriched_* | enriched_site_aaa | enriched_${source_table_name} | Overwrite | Tables enumerated in tables: in the YML, enriched with canonical_id. These can be linked with master_table using canonical_id as attribute_table/behavior_table. If multiple master_table entries are defined, they are also enriched. |
master_table | master_table_ex1 | Name defined in the YML's master_table | Overwrite | A master table containing unique canonical_id records, along with additional columns specified in attributes. |
result_key_stats | unified_cookie_id_result_key_stats | ${canonical_id_name}_result_key_stats | Append | Statistics of the unification results, such as unique canonical_id counts per key or histograms showing how many keys are associated with a single canonical_id. |
source_key_stats | unified_cookie_id_source_key_stats | ${canonical_id_name}_source_key_stats | Append | Statistics of source tables before unification, such as unique key counts or counts of unique combinations of keys. |
graph | unified_cookie_id_graph | ${canonical_id_name}_graph | Overwrite | The final loop result of the unification algorithm. |
graph_unify_loop_N | unified_cookie_id_graph_unify_loop_0 | ${canonical_id_name}_graph_unify_loop_${N} | Overwrite | Output of each loop iteration of the unification algorithm. Each loop uses the previous loop's result as input. |
lookup | unified_cookie_id_lookup | ${canonical_id_name}_lookup | Overwrite | A table for looking up the canonical_id for each key. It is created from the graph table and is used when creating the enriched and master tables. |
keys | unified_cookie_id_keys | ${canonical_id_name}_keys | Overwrite | An internal reference table used by the lookup table, recording unique IDs assigned to keys listed in keys:. |
tables | unified_cookie_id_tables | ${canonical_id_name}_tables | Overwrite | An internal reference table used by the lookup table, recording unique IDs assigned to tables listed in tables:. |
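The naming rules in the table can be applied mechanically to predict which tables a run should produce, which may help when checking the output database. The sketch below is illustrative only: `expected_output_tables` and its parameters are hypothetical names, and it assumes loop tables are numbered from 0 up to `loop_count - 1`.

```python
from typing import List


def expected_output_tables(
    canonical_id_name: str,
    source_tables: List[str],
    master_tables: List[str],
    loop_count: int,
) -> List[str]:
    """List the table names implied by the documented naming rules.

    Hypothetical helper; loop_count is an assumed input, since the number
    of loops a run performs is not stated in the naming rules themselves.
    """
    # One enriched table per source table listed in tables:
    names = [f"enriched_{table}" for table in source_tables]
    # Master tables keep the names defined in the YML.
    names += list(master_tables)
    # Tables prefixed with the canonical_id name.
    for suffix in ("result_key_stats", "source_key_stats", "graph"):
        names.append(f"{canonical_id_name}_{suffix}")
    for n in range(loop_count):
        names.append(f"{canonical_id_name}_graph_unify_loop_{n}")
    for suffix in ("lookup", "keys", "tables"):
        names.append(f"{canonical_id_name}_{suffix}")
    return names


for name in expected_output_tables(
    "unified_cookie_id", ["site_aaa"], ["master_table_ex1"], loop_count=1
):
    print(name)
```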
Categorization of Generated Tables
Table Categories by Table Type
Tables per canonical_id
The following tables are output for each canonical_id:
graph_unify_loop_N
graph
lookup
keys
tables
source_key_stats
result_key_stats
Tables per master_table
master_table
Tables per Unification Workflow
enriched
Although multiple master_table entries can be defined, only one enriched table per source table is output within the database.
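Since enriched tables can be linked with a master table through canonical_id, a join on that column is the usual way to combine them. The sketch below uses an in-memory SQLite database purely as a local stand-in for the actual query engine. The column names (`canonical_id`, `email`, `td_client_id`, `path`) and all rows are assumptions made for illustration; the real column layout depends on the YML definition.

```python
import sqlite3

# In-memory stand-in for the unification database (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE master_table_ex1 (canonical_id TEXT PRIMARY KEY, email TEXT);
    CREATE TABLE enriched_site_aaa (canonical_id TEXT, td_client_id TEXT, path TEXT);
    INSERT INTO master_table_ex1 VALUES
      ('c1', 'a@example.com'),
      ('c2', 'b@example.com');
    INSERT INTO enriched_site_aaa VALUES
      ('c1', 'id-1', '/home'),
      ('c1', 'id-2', '/pricing'),
      ('c2', 'id-3', '/home');
    """
)

# Treat the enriched table as a behavior table: count page views per person.
rows = conn.execute(
    """
    SELECT m.canonical_id, m.email, COUNT(*) AS page_views
    FROM master_table_ex1 AS m
    JOIN enriched_site_aaa AS e ON e.canonical_id = m.canonical_id
    GROUP BY m.canonical_id, m.email
    ORDER BY m.canonical_id
    """
).fetchall()

print(rows)  # [('c1', 'a@example.com', 2), ('c2', 'b@example.com', 1)]
```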
Table Categories by Process
Tables for Utilizing Results
master_table
enriched
Tables for Verifying Unification Success
source_key_stats
result_key_stats
Tables Generated During the Unification Algorithm Process
graph_unify_loop_N
graph
Tables for Mapping canonical_id and Keys
lookup
keys
tables
Relationship Between Workflow Tasks and Output Tables
The following table summarizes key tasks in the workflow and their corresponding output tables:
Task Name | Output Table | Description |
---|---|---|
+extract_and_merge | graph_unify_loop_0 | Generates the initial graph for the unification algorithm. |
+source_key_stats | source_key_stats | Outputs statistical information based on source tables. |
+loop-N > +iteration | graph_unify_loop_N | Generates the graph for each iteration of the unification algorithm. |
+loop-N > +report_diff | None (Log Output) | Outputs the number of changes in the graph compared to the previous loop. A count of 0 indicates convergence. |
+canonicalize | lookup | Creates a table assigning canonical_id to all keys. |
+result_key_stats | result_key_stats | Outputs statistics related to the generation of canonical_id. |
+enrich | enriched_* | Enriches source tables with canonical_id. |
+master_tables > +build | master_table | Generates the master table. |
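The convergence check reported by +report_diff can be illustrated with a small sketch. The actual graph table schema and the way the workflow computes the diff are not described here, so the representation below (a dict mapping each key to the group it currently belongs to) and the function `count_graph_changes` are hypothetical.

```python
from typing import Dict


def count_graph_changes(previous: Dict[str, str], current: Dict[str, str]) -> int:
    """Count keys whose assigned group differs between two loop results.

    Hypothetical stand-in for the number logged by +report_diff;
    a result of 0 means the graph did not change, i.e. it converged.
    """
    all_keys = previous.keys() | current.keys()
    return sum(1 for key in all_keys if previous.get(key) != current.get(key))


# Assumed toy data: each key maps to the representative of its group.
loop_0 = {"a": "a", "b": "a", "c": "c"}
loop_1 = {"a": "a", "b": "a", "c": "a"}
loop_2 = {"a": "a", "b": "a", "c": "a"}

print(count_graph_changes(loop_0, loop_1))  # 1
print(count_graph_changes(loop_1, loop_2))  # 0
```

In this toy run the second comparison yields 0, which corresponds to the point at which further loops would no longer change the result.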