INDEX
Explanations
adjectives and adverbs associated with deceitfulness or fraudulent behavior
Head Attr Weights
0:0.04
1:0.04
2:0.17
3:0.04
4:0.39
5:0.04
6:0.03
7:0.03
8:0.04
9:0.08
10:0.04
11:0.03
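A minimal sketch for reading the list above, assuming each line is a `head:weight` pair giving that attention head's share of the attribution. The names `RAW` and `parse_head_weights` are illustrative and not part of the dashboard. It parses the pairs and reports the head with the largest weight.

```python
# Head attribution weights copied verbatim from the list above.
RAW = """0:0.04
1:0.04
2:0.17
3:0.04
4:0.39
5:0.04
6:0.03
7:0.03
8:0.04
9:0.08
10:0.04
11:0.03"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict (hypothetical helper)."""
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, value = line.split(":", 1)
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW)
top_head = max(weights, key=weights.get)  # head with the largest weight
total = sum(weights.values())             # displayed values are rounded

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of displayed weights: {total:.2f}")
```

With the values shown, head 4 carries the largest weight, followed by head 2.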
Negative Logits
exha                  -1.28
weighs                -1.27
vale                  -1.27
Skydragon             -1.24
Racer                 -1.21
THERE                 -1.21
mosqu                 -1.21
Stability             -1.21
Ribbon                -1.19
[undecodable token]   -1.19
Positive Logits
omission    1.82
tein        1.52
deceit      1.49
deception   1.41
ousy        1.39
olester     1.37
dece        1.34
liar        1.32
dishonest   1.32
gery        1.32
Activations Density 0.011%