Explanations
casual second-person plural pronouns and references to a group
Negative Logits
lope        -0.17
423         -0.16
ifer        -0.16
himself     -0.16
ini         -0.15
indeed      -0.14
cente       -0.14
irl         -0.14
itself      -0.14
Himself     -0.14
Positive Logits
dio         0.16
äºķ         0.15
·           0.15
arkin       0.15
Assignable  0.15
Ages        0.14
412         0.14
/MIT        0.14
VERR        0.14
limp        0.14
Activations Density: 0.034%