Explanations
concepts related to inequality and its impact on society
Negative Logits
orgia    -0.16
chest    -0.15
exped    -0.15
elier    -0.14
erer     -0.14
esso     -0.14
esson    -0.14
sled     -0.14
Ness     -0.14
755      -0.13
Positive Logits
blah         0.22
allegedly    0.19
bla          0.17
Ñıк          0.16
hari         0.15
coz          0.15
supposedly   0.15
Ñĸна         0.14
ľ            0.14
etc          0.14
Activations Density 0.382%
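
As a rough illustration of what a density figure like this can mean, the sketch below computes the fraction of token positions on which a feature's activation is above zero. This definition, the function name activation_density, the threshold parameter, and the sample data are all assumptions made for illustration. None of them come from this page, and the dashboard's actual computation may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (# active positions) / (# positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.400%"
```

Under this reading, a density of 0.382% would mean the feature is active on roughly 4 out of every 1000 tokens sampled.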