INDEX
Explanations
concepts related to equality and dignity
Negative Logits
882            -0.15
apan           -0.15
Mellon         -0.14
Jaune          -0.14
Underground    -0.14
ocz            -0.14
281            -0.14
intptr         -0.14
/icons         -0.14
incoming       -0.13
Positive Logits
human          0.31
dignity        0.31
Human          0.28
human          0.25
Human          0.25
dign           0.25
worth          0.24
equal          0.22
UMAN           0.21
-human         0.20
Activations Density 0.147%