INDEX
Explanations
references to war or military conflicts
war or Anwar
Negative Logits
Cecilia           -0.54
BASEPATH          -0.52
Rico              -0.51
PickerController  -0.50
oulli             -0.49
Cecilia           -0.48
tine              -0.47
濟                -0.46
cipline           -0.46
Lucio             -0.46
Positive Logits
war               2.83
WAR               2.11
war               2.03
War               1.87
WAR               1.80
War               1.75
wars              1.62
wara              1.27
wars              1.25
guerra            1.25
Activations Density 0.009%