Explanations
concepts related to social progress and historical ideology
Negative Logits
omor        -0.15
ALSE        -0.15
kü          -0.15
ubat        -0.15
avax        -0.15
υχ          -0.14
mặt         -0.14
орот        -0.14
elm         -0.14
democrat    -0.14
Positive Logits
Locke       0.29
Hob         0.28
Lock        0.28
Lock        0.24
Bent        0.23
Burke       0.22
.Lock       0.21
Adam        0.21
Raw         0.20
Mill        0.20
Activations Density 0.095%