INDEX
Explanations
phrases related to legal actions or criminal activities
references to gender-based violence and its consequences
NEGATIVE LOGITS
soDeliveryDate    -0.81
Devi              -0.71
acci              -0.70
olars             -0.69
Dino              -0.66
Krypt             -0.66
answered          -0.64
©¶æ¥µ             -0.63
Samurai           -0.61
athered           -0.61
POSITIVE LOGITS
knowingly         1.25
materially        1.13
willfully         1.10
unlawfully        1.07
:(                1.04
unlawful          1.00
wil               0.99
willful           0.98
intentionally     0.96
reck              0.95
Activations Density 0.280%
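
A density figure like the one above can be read as the share of token positions on which the feature fires. The sketch below assumes that reading (activation strictly above a threshold, here 0) and is not taken from the dashboard's own code. The function name `activation_density` and the toy data are hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up toy data: 1000 token positions, 3 of which are active.
toy = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(toy):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.280% would mean the feature is active on roughly 28 of every 10,000 tokens sampled.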