INDEX
Explanations
actions and high-intensity sequences in narratives
Negative Logits
addock    -0.17
okie      -0.17
ansson    -0.14
reater    -0.14
rous      -0.14
ongan     -0.14
ture      -0.14
gow       -0.14
WXYZ      -0.14
bery      -0.13
Positive Logits
yk         0.16
-packed    0.14
ais        0.14
mil        0.14
ypy        0.14
Brend      0.14
nea        0.14
adio       0.13
mailer     0.13
Klopp      0.13
Activations Density: 0.015%