Explanations
references to the Soviet Union
references to the Soviet Union and related entities
Negative Logits
Token   Logit
butt    -0.76
trak    -0.74
epad    -0.73
Score   -0.73
miah    -0.72
ients   -0.71
Thom    -0.69
BILL    -0.67
ient    -0.67
gered   -0.67
Positive Logits
Token     Logit
Union     1.22
Soviet    0.81
dictator  0.81
USSR      0.78
akia      0.78
Pact      0.78
KGB       0.77
bloc      0.77
oslov     0.76
kaya      0.76
Activations
Density: 0.041%