INDEX
Explanations
mentions of wars and conflicts
phrases referring to wars and conflicts, particularly historical events
Negative Logits
©¶æ¥µ        -0.77
*/(          -0.72
Helpful      -0.68
©¶æ          -0.65
Vander       -0.63
nce          -0.63
nailed       -0.63
phia         -0.62
çīĪ          -0.62
TOP          -0.61
Positive Logits
enegger      1.05
torn         0.90
raged        0.89
attrition    0.83
Afghanistan  0.78
schild       0.77
istan        0.75
waged        0.75
milit        0.74
Armageddon   0.74
Activations Density 0.195%