Explanations
instances of targeting, threats, and assassination in various contexts
Negative Logits
abestanden       -0.56
setupUi          -0.55
σια              -0.52
roxene           -0.52
esModule         -0.50
pastas           -0.50
ETING            -0.49
Verkehr          -0.49
Citiți           -0.49
createComponent  -0.49
Positive Logits
attacking  1.11
targeting  1.09
targets    1.08
attack     1.03
attacks    0.99
targeted   0.95
target     0.94
attacked   0.94
Targeting  0.93
Targeting  0.92
Activations Density: 0.501%