Explanations
references to the Cold War
Negative Logits
Token                             Logit
enance                            -0.78
confir                            -0.68
ript                              -0.66
fined                             -0.66
chall                             -0.64
offic                             -0.63
NPR                               -0.61
authorized                        -0.61
celebr                            -0.59
rawdownloadcloneembedreportprint  -0.59
Positive Logits
Token     Logit
War       1.06
eties     0.91
achine    0.87
War       0.85
idge      0.84
Era       0.83
ridge     0.82
Ages      0.82
Wars      0.79
erton     0.78
Activations Density 0.029%