Explanations
mentions of actions or discussions regarding interventions, especially in political or conflict contexts
instances of the word "intervention" and related terms
Negative Logits
Token       Logit
Export      -0.75
carbon      -0.74
cale        -0.73
WE          -0.73
λ           -0.72
Boh         -0.70
\\\\\\\\    -0.65
ndra        -0.64
versions    -0.64
cake        -0.64
Positive Logits
Token         Logit
ateral        1.02
aries         0.97
anke          0.92
interven      0.91
arie          0.87
intervention  0.87
arial         0.86
ulatory       0.80
ist           0.79
intervene     0.79
Activations Density 0.018%