Explanations
words related to political or military intervention
terms related to intervention in various contexts
Negative Logits
cale      -0.72
WE        -0.70
models    -0.69
λ         -0.68
Boh       -0.66
Bones     -0.65
carbon    -0.64
Export    -0.64
house     -0.63
ndra      -0.63
POSITIVE LOGITS
ateral          1.07
intervene       0.90
anke            0.89
interven        0.88
aries           0.87
intervention    0.87
ulatory         0.85
intervened      0.77
arie            0.76
intervening     0.74
Activations Density 0.033%