INDEX
Explanations
news headlines or titles containing the word "WATCH"
instances of video content or specific watch-related cues
Negative Logits
Shape      -0.72
ound       -0.64
abase      -0.62
Sin        -0.62
Inher      -0.60
aries      -0.57
inherit    -0.57
delinqu    -0.56
acle       -0.56
erk        -0.55
Positive Logits
WATCH      3.94
VIDEO      2.14
LIST       1.42
BRE        1.36
READ       1.24
WATCH      1.24
PHOTOS     1.22
TIME       1.19
RELATED    1.16
CHECK      1.13
Activations Density: 0.028%