INDEX
Explanations
instances of the word "watch" followed by some action
phrases related to watching videos or live broadcasts
Negative Logits
insula      -0.71
wx          -0.66
Syl         -0.66
rette       -0.65
illet       -0.65
reusable    -0.65
omorphic    -0.63
ascal       -0.63
isma        -0.62
âĵĺ         -0.62
Positive Logits
VIDEOS      1.05
unfold      0.99
flix        0.83
ideos       0.80
broadcasts  0.78
tapes       0.77
runs        0.75
cartoons    0.75
videos      0.74
films       0.72
Activations Density 0.140%