INDEX
Explanations
references to past connections or relationships with someone or something
references to previous roles or statuses of individuals
Negative Logits
achus     -0.98
otle      -0.91
anguage   -0.83
andise    -0.81
onso      -0.76
antics    -0.75
allery    -0.74
ourced    -0.73
ghai      -0.73
acht      -0.72
Positive Logits
Yugoslavia   0.96
Yugoslav     0.95
Soviet       0.82
captives     0.78
comrade      0.76
dictator     0.76
Lum          0.75
President    0.75
wartime      0.75
comrades     0.75
Activations Density: 0.029%