INDEX
Explanations
Forms of the verb "to become" and related phrases
Negative Logits
ERA      -0.17
era      -0.16
zhou     -0.13
sebou    -0.13
SS       -0.13
spat     -0.13
them     -0.13
oor      -0.13
igue     -0.13
onom     -0.13
Positive Logits
gota     0.16
ocard    0.15
one      0.15
ardon    0.15
among    0.15
ippo     0.14
lied     0.14
venth    0.14
iveau    0.14
roi      0.14
Activations Density 0.169%