Explanations
phrases related to distraction or covering up potential issues
phrases related to distraction techniques or obfuscation strategies
Negative Logits
Token      Logit
romy       -0.80
varies     -0.71
®,         -0.70
throp      -0.67
azes       -0.66
uties      -0.66
rones      -0.65
cients     -0.65
molded     -0.65
FTWARE     -0.64
Positive Logits
Token          Logit
publicity      1.28
embarrassing   1.14
embarrassment  1.12
retribution    1.01
retaliation    0.98
incrim         0.95
discredit      0.94
embarrass      0.94
scapego        0.88
retali         0.87
Activations Density: 0.743%