INDEX
Explanations
statements about human nature and societal dynamics
Negative Logits
çĸ          -0.15
undler      -0.15
æĥij        -0.15
ahat        -0.15
ารย         -0.14
azzo        -0.14
arget       -0.14
ubu         -0.14
keleton     -0.14
ümÃ¼ÅŁ      -0.14
Positive Logits
wired       0.25
creatures   0.23
nt          0.21
bomb        0.18
attracted   0.17
born        0.17
wiring      0.17
creature    0.16
complex     0.16
Creatures   0.16
Activations Density: 0.068%