Explanations
news headlines indicating a video has been watched, typically paired with a phrase in all caps
instances of the word "watched" and its variations, indicating a focus on viewing or media consumption
Negative Logits
etheless          -0.87
genre             -0.81
appropri          -0.79
soDeliveryDate    -0.75
fortunately       -0.75
ional             -0.72
ococ              -0.69
comprom           -0.69
aten              -0.69
brainer           -0.69
Positive Logits
Highlights        1.23
Inside            1.08
How               1.07
Transcript        1.07
Why               1.06
Timeline          1.06
What              1.05
WATCH             1.05
Recap             1.02
Coverage          1.01
Activations Density: 0.027%