Explanations
key points or themes related to identity and societal roles
Negative Logits
Token     Logit
νια       -0.17
.nih      -0.16
agos      -0.16
OUCH      -0.16
agit      -0.15
UMMY      -0.15
Trot      -0.15
:maj      -0.14
ouch      -0.14
ãģ»ãģĨ    -0.14
Positive Logits
Token     Logit
anything  0.17
oÄį       0.16
qt        0.16
ever      0.15
EVER      0.15
anymore   0.15
ured      0.15
anything  0.14
anywhere  0.14
Tomorrow  0.14
Activations Density 0.157%