INDEX
Explanations
calls to action related to watching videos or shows
Negative Logits
Token      Logit
bun        -0.70
amalg      -0.57
turf       -0.57
nib        -0.56
assum      -0.56
recl       -0.55
barley     -0.55
bisc       -0.55
peanuts    -0.54
clay       -0.54
Positive Logits
Token      Logit
WATCH      0.87
VIDEOS     0.73
Watching   0.70
Videos     0.68
esome      0.67
dog        0.65
View       0.64
Sasha      0.63
...]       0.61
Canaveral  0.61
Activations Density: 0.001%