Explanations
terms related to crime and deception
Negative Logits
kles         -0.14
Carrier      -0.14
bouquet      -0.14
hoàng        -0.14
Collapsed    -0.13
Chance       -0.13
erte         -0.13
decess       -0.13
坦           -0.13
599          -0.13
Positive Logits
targeting    0.28
target       0.26
intent       0.23
target       0.21
targets      0.19
attack       0.19
arget        0.18
loose        0.18
Target       0.18
intent       0.17
Activations Density 0.334%