INDEX
Explanations
terms related to control, targeting, and specific actions
terms related to intentional and systematic actions or experiments
Negative Logits
pires       -0.84
aughty      -0.79
orthy       -0.77
asio        -0.74
baum        -0.71
Qiao        -0.70
kj          -0.70
astical     -0.69
Carbuncle   -0.69
esteemed    -0.69
Positive Logits
demolition    1.24
elimination   1.24
executions    1.23
deletion      1.21
removal       1.15
assass        1.14
eviction      1.13
destruction   1.13
demol         1.13
killings      1.13
Activations Density 0.226%