Flipside AI Workflow Breakdown

What happens when you run a Flipside AI Workflow? Let's look under the hood at the data used, the architecture, the query sequence, the pitfalls avoided, and the analysis performed.

CEFI Bridge Intelligence - Technical Data Flow

Business Requirement

Track institutional capital movement across blockchains to understand which centralized entities (funds, known institutional traders, exchanges, custodians, etc.) are moving liquidity between chains, through which bridge platforms, and in what volumes.

The Data Architecture

Supporting Dataset: crosschain.core.dim_labels, the cross-chain address-label dimension used by the Step 1 subquery to identify institutional infrastructure addresses.

Primary Dataset: crosschain.defi.ez_bridge_activity, the curated bridge-transfer fact table the main query runs against, with source and destination details for each transfer.

The Sequence

Step 1: Build the Entity Filter (Subquery)

This creates a focused filter of institutional operational addresses. We're not looking for every Coinbase user - we're looking for Coinbase's own infrastructure addresses that move institutional liquidity.
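
A minimal sketch of what this subquery could look like, assuming dim_labels exposes address, project_name, label_type, and label_subtype columns (the label values shown are illustrative, not a verified schema):

```sql
-- Hedged sketch: column names and label values are assumptions about
-- crosschain.core.dim_labels rather than a confirmed schema.
SELECT DISTINCT
    address,
    project_name,
    label_subtype AS address_type    -- e.g. 'hot_wallet', 'cold_wallet'
FROM crosschain.core.dim_labels
WHERE label_type = 'cex'             -- centralized entities only
  -- infrastructure wallets, not per-user deposit addresses
  AND label_subtype IN ('hot_wallet', 'cold_wallet')
```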

Step 2: Join Against Bridge Activity (Main Query)

Critical join logic: We join on BOTH source and destination addresses because institutional activity can occur in either direction.
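
A sketch of that shape, assuming the Step 1 filter is materialized as an institutional_addresses CTE and that ez_bridge_activity carries source_address and destination_address columns:

```sql
-- Dual-sided join sketch; institutional_addresses is the Step 1 subquery
-- as a CTE, and the bridge table's column names are assumptions.
SELECT
    b.*,
    src.project_name   AS source_project_name,
    src.address_type   AS source_address_type,
    dest.project_name  AS destination_project_name,
    dest.address_type  AS destination_address_type
FROM crosschain.defi.ez_bridge_activity AS b
LEFT JOIN institutional_addresses AS src
       ON b.source_address = src.address
LEFT JOIN institutional_addresses AS dest
       ON b.destination_address = dest.address
WHERE (src.address IS NOT NULL OR dest.address IS NOT NULL)
```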

The (src.address IS NOT NULL OR dest.address IS NOT NULL) ensures we capture transactions where institutions are on either side.

Step 3: Fraud Pattern Exclusion

Filter: NOT (source_project_name IS NULL AND destination_address_type = 'hot_wallet')

This removes suspicious patterns where unknown projects are sending directly to hot wallets - likely scam tokens or wash trading attempts.
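
Expressed against the join aliases from the Step 2 sketch, the exclusion composes with the dual-sided predicate in the same WHERE clause:

```sql
WHERE (src.address IS NOT NULL OR dest.address IS NOT NULL)
  -- drop unknown-source transfers into hot wallets:
  -- likely scam tokens or wash trading attempts
  AND NOT (src.project_name IS NULL AND dest.address_type = 'hot_wallet')
```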

Step 4: Aggregation Structure

Group by: month, bridge_platform, source_chain, destination_chain, source_project_name, source_address_type, destination_project_name, destination_address_type, token_symbol

This granular grouping preserves analytical flexibility - you can roll up to entity-level summaries or drill down to specific chain corridors and token movements.

Computed metrics per group: at this grain, the natural measures are transfer counts and USD volume totals.
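
A sketch of the full aggregation, assuming amount_usd as the volume column and filtered_bridges as a hypothetical CTE holding the output of Steps 1 through 3:

```sql
-- Aggregation sketch; the metric names, amount_usd, and filtered_bridges
-- are assumptions layered on the earlier sketches.
SELECT
    DATE_TRUNC('month', block_timestamp)   AS month,
    platform                               AS bridge_platform,
    source_chain,
    destination_chain,
    source_project_name,
    source_address_type,
    destination_project_name,
    destination_address_type,
    token_symbol,
    COUNT(*)                               AS transfer_count,
    SUM(amount_usd)                        AS total_volume_usd,
    AVG(amount_usd)                        AS avg_transfer_usd
FROM filtered_bridges
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
```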

Why This Order Matters

  1. Entity identification first: You must know WHO before you can filter WHAT. Building the labeled address filter upfront prevents scanning billions of irrelevant bridge transactions.
  2. Dual-sided join: Institutions appear on both sides of bridge transactions. A single-sided join would miss half the activity.
  3. Time-scoped retrieval: Starting from 2024-01-01 balances data freshness with query performance. Bridge activity compounds quickly - earlier dates add diminishing analytical value.
  4. Verified tokens only: Filters out spam tokens and scam projects that would otherwise inflate volume metrics. Together with the time scope, this reduces to two WHERE predicates (see the sketch after this list).
  5. Monthly granularity: Strikes a balance between temporal precision and data volume. Daily would create sparse results for many entities; quarterly would obscure short-term shifts.
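
Points 3 and 4 as predicates; the verification column name here is an assumption, since curated tables expose token verification in different ways:

```sql
WHERE b.block_timestamp >= '2024-01-01'   -- time-scoped retrieval
  AND b.token_is_verified = TRUE          -- assumed column name; verified tokens only
```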

Data Pitfalls Avoided

Each design choice above neutralizes a specific failure mode: an unbounded scan across billions of irrelevant bridge transactions (avoided by building the entity filter first), a single-sided join that would miss half of institutional activity, unknown-source transfers into hot wallets polluting results with scam tokens and wash trading, and unverified spam tokens inflating volume metrics.

Result Structure

The query returns a denormalized fact table where each row represents one month of activity for a single combination of bridge platform, chain corridor (source and destination chain), labeled entity pair, and token, alongside its aggregated metrics.

This structure supports immediate analysis without additional joins, enabling rapid calculation of entity rankings, chain corridor analysis, bridge platform market share, and temporal trend detection.
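
For example, a market-share rollup needs only the grouping keys and the volume metric. The table and column names below follow the earlier aggregation sketch, not a confirmed output schema:

```sql
-- Bridge platform market share by month, computed from the denormalized result.
SELECT
    month,
    bridge_platform,
    SUM(total_volume_usd) AS platform_volume_usd,
    SUM(total_volume_usd)
      / SUM(SUM(total_volume_usd)) OVER (PARTITION BY month) AS market_share
FROM bridge_intel_results   -- hypothetical name for the saved query output
GROUP BY month, bridge_platform
ORDER BY month, platform_volume_usd DESC
```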