INDEX
Explanations
video-related terms and actions
instances of the word "play" and its variations
Negative Logits
fty        -0.87
isse       -0.64
urus       -0.64
irl        -0.62
Beir       -0.60
########   -0.60
reach      -0.59
kos        -0.59
arger      -0.59
pora       -0.59
Positive Logits
ername      1.02
lists       1.01
wright      0.92
clips       0.82
aloud       0.80
recordings  0.80
havoc       0.80
Piano       0.76
catch       0.76
slideshow   0.76
Activations Density 0.053%
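
As a rough illustration of what a density figure like the one above usually means, the sketch below computes the fraction of tokens on which a feature is active. This is an assumption about the metric, not a description of how this page computed it: the function name `activation_density`, the `threshold` parameter, and the sample activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations: one active token out of 2000.
sample = [0.0] * 1999 + [3.2]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.053% would correspond to the feature firing on roughly 1 token in every 1,900.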