Explanations
references to war and its associated narratives
Negative Logits
ael        -0.17
erd        -0.15
wildcard   -0.15
妃         -0.15
weave      -0.15
739        -0.15
υκ         -0.15
Worldwide  -0.14
Weights    -0.14
watchdog   -0.14
Positive Logits
war     0.66
-war    0.62
War     0.61
war     0.57
War     0.56
WAR     0.55
_war    0.50
guerra  0.47
WAR     0.47
战争    0.43
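Non-ASCII tokens in logit lists like these are often displayed in the raw byte-mapped form used by byte-level BPE vocabularies, so that 战争 appears as `æĪĺäºī`. The sketch below shows one way to turn such a display string back into readable text. It assumes the vocabulary uses the GPT-2 style byte-to-character table, which this page does not state; `decode_display_token` is a hypothetical helper name, and the fallback for characters outside the table is a guess at how partially repaired strings might be handled.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte value to a printable character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points starting at 256.
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Hypothetical helper: map a byte-level display string back to UTF-8 text."""
    raw = bytearray()
    for ch in token:
        if ch in BYTE_DECODER:
            raw.append(BYTE_DECODER[ch])
        else:
            # Assumption: a character outside the table was already decoded
            # upstream, so keep its own UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for shown in ["æĪĺäºī", "å¦ĥ", "guerra"]:
        print(repr(shown), "->", repr(decode_display_token(shown)))
```

Plain ASCII tokens such as `guerra` pass through unchanged, so the helper can be applied to a whole token column without special-casing.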
Activations Density 0.127%
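Activation density is commonly read as the fraction of sampled tokens on which the feature has a nonzero activation. The sketch below illustrates that reading only; the definition, the `threshold` parameter, and the sample data are assumptions and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    total = len(activations)
    if total == 0:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / total


if __name__ == "__main__":
    # Made-up sample: the feature fires on 127 of 100,000 tokens.
    sample = [0.0] * 99_873 + [1.5] * 127
    print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.127% would mean the feature fires on roughly 1 in every 800 tokens.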