INDEX
Explanations
instances of lying or deception
instances of deception or dishonesty, particularly involving lying to authorities or the public
Negative Logits
eric                  -0.72
winner                -0.67
pour                  -0.67
specialization        -0.63
guiActiveUnfocused    -0.62
Interstitial          -0.62
pora                  -0.62
Radius                -0.62
procession            -0.62
ateur                 -0.60
Positive Logits
omission              1.03
deceive               0.81
falsely               0.79
uth                   0.79
perjury               0.78
testifying            0.78
dece                  0.77
incrim                0.77
truth                 0.73
911                   0.73
Activations Density 0.287%
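
A density figure like the one above is usually read as the fraction of sampled tokens on which the feature fires. The sketch below assumes that definition. The function name, the threshold, and the sample data are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, of which 3 activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.287% would mean the feature is active on roughly 3 of every 1000 tokens.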