INDEX
Explanations
references to specific years or historical events
Negative Logits
opp       -0.16
crus      -0.16
laisse    -0.15
Flem      -0.15
obar      -0.14
sh        -0.14
.tmp      -0.14
uru       -0.14
Shelby    -0.14
ruz       -0.14
Positive Logits
Soviet       0.28
socialist    0.28
Stalin       0.28
Lenin        0.27
Communist    0.26
Cuba         0.25
Socialist    0.25
socialism    0.24
communist    0.24
Party        0.24
Activations Density: 0.185%