Explanations
personal pronouns related to individuality and collective identity
Negative Logits
either    -0.17
EITHER    -0.16
both      -0.16
æĹ¢       -0.15
etc       -0.15
Both      -0.14
BOTH      -0.14
otos      -0.13
when      -0.13
ucz       -0.13
Positive Logits
society           0.27
indeed            0.24
mankind           0.23
anybody           0.22
ourselves         0.22
humanity          0.21
anyone            0.21
vice              0.19
.scalablytyped    0.19
ociety            0.18
Activations Density 0.203%