Attention Index
wtf is the Attention Index
The Attention Index measures near-real-time mindshare for a topic (e.g., “bitcoin”, “Elon Musk”) by aggregating public engagement data from X (Twitter), Reddit, and YouTube into a single, unitless Engagement Index (EI), then mapping EI to a monetized, tradable scale called DoA (Dollar of Attention) for use in Trendle markets. Computation runs every minute over a rolling context window to capture spikes and short-term persistence. Built on Azuro’s indexing and data infrastructure.
No artificial intelligence or generative systems are used. The index is computed with deterministic, transparent math: normalization, exponential time-decay, quantile clipping, and light smoothing.

Data Ingest: Inputs & Collectors
Independent collectors continuously pull raw events per tracked topic and persist them for downstream processing:
X (Twitter) Collector: tweet-level public engagement (retweets, replies, likes, quotes, bookmarks, impressions) plus author/post metadata.
Reddit Collector: activity across relevant subreddits/feeds (e.g., hot/rising/top); metrics include scores and comment counts.
YouTube Collector: video/channel signals (views, likes, comments) for topic-matched videos.
Coming in v2: We’ll integrate Google Search attention signals (topic-matched search interest and query dynamics) as an additional collector. These features will follow the same normalization, deseasonalization, and decay rules and will flow into EI → DoA like the other sources.
Data Preparation (minute-level, topic-bounded)
Time-bounded pull: load the context window of events per source for each topic.
Per-minute grid: convert source events to 1-minute bars and join all sources on a common minute grid.
Gap handling: use forward-fill (and limited backward-fill) so each minute has a dense matrix of features; track completeness (see the sketch after this list).
Quality controls: deduplication, basic low-quality/bot heuristics, and language/topic filters (as described in the prep stage).
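To make the minute-grid and gap-handling step concrete, here is a minimal pandas sketch. The column names, the fill limits, and the completeness calculation are illustrative assumptions, not the production schema:

```python
import pandas as pd

def to_minute_grid(events: pd.DataFrame, value_cols: list[str]) -> pd.DataFrame:
    """Aggregate raw events (assumed schema: 'ts' timestamp + numeric columns)
    into 1-minute bars by summing within each minute."""
    return events.set_index("ts").resample("1min")[value_cols].sum()

def join_sources(x_bars: pd.DataFrame,
                 reddit_bars: pd.DataFrame,
                 youtube_bars: pd.DataFrame,
                 ffill_limit: int = 5) -> pd.DataFrame:
    """Join all sources on a common minute grid, fill gaps, and track completeness."""
    grid = x_bars.join([reddit_bars, youtube_bars], how="outer").sort_index()
    # Completeness is measured before filling: fraction of populated features per minute.
    completeness = grid.notna().mean(axis=1)
    # Forward-fill short gaps, with a limited backward-fill for leading gaps.
    grid = grid.ffill(limit=ffill_limit).bfill(limit=1)
    grid["completeness"] = completeness
    return grid
```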
Feature Families
Azuro's infra constructs multiple feature families per minute and topic.
Reddit (14 metrics)
Reddit data is split into three channels, each reflecting a different facet of attention: Level (stable interest), Momentum (current popularity), and Velocity (growth speed).
Level: stable interest (Top/last hour)
Analyzes posts that are in Top over the last hour. Captures established, steady attention to the topic.
level_posts_count - number of Top posts for the topic.
level_total_score - total score (upvotes minus downvotes) across those Top posts; captures aggregate positive appraisal.
level_total_comments - total comments under the Top posts; reflects discussion level.
level_avg_score - average score per Top post; a quality proxy for the average post.
Momentum: current popularity (Hot)
Analyzes Hot posts. Reddit’s Hot algorithm uses both score and post age, so this channel shows what’s popular right now.
momentum_posts_count - number of Hot posts.
momentum_total_score - total score across Hot posts; aggregate approval for what’s trending.
momentum_total_comments - total comments on Hot posts; breadth of active discussion.
momentum_avg_score - average score per Hot post.
Velocity: rising attention (Rising)
Analyzes Rising posts, i.e. those gaining votes very quickly. Indicates emerging virality and sharp increases in interest.
velocity_posts_count - number of Rising posts.
velocity_total_score - total score across Rising posts; aggregate assessment of what’s taking off.
velocity_total_comments - total comments on Rising posts; engagement with newly viral content.
velocity_avg_age_hours - average age (in hours) of Rising posts; freshness of the viral wave.
velocity_max_speed - maximum observed score-growth rate among Rising posts; a key virality indicator.
velocity_trimmed_mean - trimmed mean of post-level growth speeds: the average growth rate after removing the lowest and highest extremes to stabilize the metric.
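As an illustration of the two less obvious Velocity metrics, here is a minimal sketch. It assumes each Rising post carries a hypothetical per-post score-growth rate (score gained per hour); the trim fraction and units are placeholders, not the production configuration:

```python
import numpy as np

def velocity_features(growth_rates: np.ndarray, trim_frac: float = 0.1) -> dict:
    """growth_rates: per-post score-growth speeds for Rising posts (assumed units: score/hour)."""
    if growth_rates.size == 0:
        return {"velocity_max_speed": 0.0, "velocity_trimmed_mean": 0.0}
    rates = np.sort(growth_rates)
    k = int(len(rates) * trim_frac)  # number of extremes to drop at each end
    trimmed = rates[k: len(rates) - k] if len(rates) > 2 * k else rates
    return {
        "velocity_max_speed": float(rates.max()),        # fastest-growing Rising post
        "velocity_trimmed_mean": float(trimmed.mean()),  # stabilized average growth speed
    }
```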
YouTube (3 metrics; view-weighted)
For YouTube we use aggregated, view-weighted metrics, so more popular videos contribute more to the final features.
youtube_views - total views across all topic-matched videos in the window; primary reach indicator.
youtube_likes (view-weighted) - likes normalized by views, so likes on widely watched videos have greater weight; better reflects engagement quality than raw sums.
youtube_comments (view-weighted) - comments normalized by views; reflects depth of engagement rather than raw volume.
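One plausible reading of “view-weighted” likes and comments is a ratio of totals, which weights each video’s like/comment rate by its view count. A minimal sketch under that assumption (the video fields and the exact weighting are assumptions, not the confirmed production formula):

```python
def youtube_features(videos: list[dict]) -> dict:
    """videos: topic-matched videos with hypothetical keys 'views', 'likes', 'comments'."""
    total_views = sum(v["views"] for v in videos)
    if total_views == 0:
        return {"youtube_views": 0, "youtube_likes": 0.0, "youtube_comments": 0.0}
    return {
        "youtube_views": total_views,                                          # primary reach indicator
        "youtube_likes": sum(v["likes"] for v in videos) / total_views,        # view-weighted like rate
        "youtube_comments": sum(v["comments"] for v in videos) / total_views,  # view-weighted comment rate
    }
```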
X (Twitter) (6 metrics; per-minute averages)
For X we compute per-minute averages across all topic-matched tweets within the window.
retweet_count (avg) - how often content is reshared; a proxy for virality/diffusion.
reply_count (avg) - discussion intensity and direct interaction.
like_count (avg) - broad approval/interest.
quote_count (avg) - deeper engagement where users share content with added commentary.
bookmark_count (avg) - “save for later” behavior; perceived usefulness.
impression_count (avg) - exposure baseline; platform reach.
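The X features are plain per-minute means over topic-matched tweets. A short pandas sketch, assuming a hypothetical tweet table with a 'ts' timestamp and the engagement columns listed above:

```python
import pandas as pd

X_METRICS = ["retweet_count", "reply_count", "like_count",
             "quote_count", "bookmark_count", "impression_count"]

def x_minute_averages(tweets: pd.DataFrame) -> pd.DataFrame:
    """Average each engagement metric across all topic-matched tweets, per minute."""
    return tweets.set_index("ts").resample("1min")[X_METRICS].mean()
```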
From Raw Features → Engagement Index (EI)
Azuro's infra converts mixed metrics into a stable, comparable per-topic EI via the following pipeline (sketched in notation after the list):
Global min–max normalization with fixed bounds per metric.
Exponential decay over a 6-hour window with half-life = n (recency-weighted aggregation).
Quantile clipping to suppress tails, then light smoothing for micro-noise reduction.
These steps produce the per-minute Engagement Index for each topic.
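A hedged sketch of those steps in notation. The bounds, the half-life n, the clipping quantiles, the smoothing span, and how per-metric aggregates are combined into a single EI (shown here as a simple sum) are configuration choices not published here:

```latex
% Fixed-bound min-max normalization per metric m at minute t (assumed form)
\tilde{x}_{m,t} = \frac{x_{m,t} - x_m^{\min}}{x_m^{\max} - x_m^{\min}}

% Recency-weighted aggregation over the 6-hour window W with half-life n
w_{\Delta} = 2^{-\Delta / n}, \qquad
\mathrm{EI}_t^{\text{raw}} = \sum_{m} \frac{\sum_{\Delta \in W} w_{\Delta}\, \tilde{x}_{m,\,t-\Delta}}{\sum_{\Delta \in W} w_{\Delta}}

% Quantile clipping to [q_{\alpha}, q_{1-\alpha}], then light smoothing (e.g. a short moving average)
\mathrm{EI}_t = \mathrm{smooth}\big(\mathrm{clip}(\mathrm{EI}_t^{\text{raw}};\, q_{\alpha}, q_{1-\alpha})\big)
```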
From EI → DoA (Dollar of Attention)
To make EI directly usable for pricing and P&L in Trendle markets, Azuro's infra applies a linear multiplier (the “DoA multiplier”) to EI.
With the current configuration (subject to change), DOA_MULTIPLIER = n, yielding a human-readable, cross-topic comparable DoA scale. Higher DoA means higher current attention; changes in DoA (ΔDoA) drive P&L on Trendle.
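In notation, the mapping is simply the linear scaling described above, with the multiplier kept as the configurable constant n:

```latex
\mathrm{DoA}_t = \mathrm{EI}_t \times \mathrm{DOA\_MULTIPLIER}, \qquad
\Delta\mathrm{DoA}_t = \mathrm{DoA}_t - \mathrm{DoA}_{t-1}
```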
How a Spike Propagates (hypothetical micro-walkthrough)
A viral tweet and a breakout YouTube clip appear within minutes; normalized X likes/retweets and YouTube views jump.
Deseasonalization checks whether this hour is usually “hot”; if so, it deflates the metrics so that only the excess over the typical hourly pattern contributes.
Exponential decay emphasizes very recent activity, so the spike quickly lifts EI.
Quantile clipping limits the impact of outlier posts or videos, ensuring no single extreme value skews results and helping to smooth short-term volatility across nearby time windows.
EI is multiplied by the DoA multiplier (n) → DoA ticks up; if the spike persists over several minutes, the DoA level remains elevated until the recency weights roll off.
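To make “recency weights roll off” concrete, here is a toy sketch of half-life decay weights for a one-off spike. The 30-minute half-life is a placeholder standing in for the unspecified configuration value n above:

```python
import numpy as np

def decay_weight(minutes_ago: np.ndarray, half_life_min: float) -> np.ndarray:
    """Exponential half-life decay: the weight halves every `half_life_min` minutes."""
    return 0.5 ** (minutes_ago / half_life_min)

# A single spike observed at increasing distances in the past:
ages = np.array([0, 15, 30, 60, 120])            # minutes since the spike
print(decay_weight(ages, half_life_min=30))      # -> [1.0, ~0.71, 0.5, 0.25, 0.0625]
```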
Index Behavior and Comparability
Unitless EI enables combining heterogeneous signals while preserving directionality (more engagement → higher EI).
The Attention Index (DoA) is NOT normalized to a fixed 1–100 range and has no hard upper bound.
DoA values are cross-topic comparable by design. This comparability is achieved through fixed global normalization bounds for each underlying metric and a common DoA multiplier applied uniformly across all topics. As a result, a DoA value of 100 on one topic represents the same absolute level of measured attention as a DoA value of 100 on any other topic.
DoA levels may exceed 100 during periods of unusually high attention relative to the global baseline. Higher values indicate greater absolute attention intensity, not a percentile ranking or capped score.
Why Multi-Source Attention Is Hard to Manipulate
Trendle’s first line of defense against manipulation is not simply using more data, but using heterogeneous platforms with independent incentive and anti-spam systems. X, Reddit, and YouTube each optimize for different user behaviors, surface content differently, and apply their own moderation and abuse-detection heuristics.
As a result, manipulation costs do not scale linearly with the number of sources. To artificially inflate attention in a way that survives Trendle’s aggregation, an attacker would need to coordinate behavior that is (1) credible, (2) platform-native, (3) spread across multiple ecosystems, and (4) simultaneous. This is significantly harder and more expensive than gaming a single platform.
Real attention tends to propagate organically across platforms (discussion appears on Reddit, clips surface on YouTube, and reactions spread on X). Coordinated fake campaigns, by contrast, usually break down on at least one surface, creating inconsistencies that are dampened during normalization and aggregation.
In effect, Trendle benefits from the combined anti-spam and incentive structures of each platform, making sustained manipulation economically unattractive.
