Explanations
names and identifiers, including proper nouns and titles
Negative Logits
(               -0.18
（昭和          -0.14
Arizona         -0.13
svens           -0.13
behaviours      -0.13
Arizona         -0.13
peon            -0.13
"](             -0.13
(«              -0.12
PN              -0.12
Positive Logits
Marx            0.35
Heg             0.31
Marxist         0.25
Marxism         0.24
Lenin           0.23
Kant            0.20
Rousse          0.20
Karl            0.20
liberalism      0.20
Enlightenment   0.20
Activations Density 0.003%