INDEX
Explanations
references to deception and impersonation in criminal contexts
Negative Logits
isay              -0.20
SPATH             -0.15
weg               -0.15
AdapterManager    -0.15
combe             -0.15
iquer             -0.14
_PCI              -0.14
íĹĮ               -0.14
quest             -0.14
PCI               -0.13
Positive Logits
convinc           0.23
convincing        0.22
fake              0.18
è£Ŀ               0.17
/fa               0.17
Fake              0.16
overn             0.16
plausible         0.16
believable        0.16
fake              0.15
Activations Density 0.075%