INDEX
Explanations
phrases related to deception or hidden motives
Head Attr Weights
0:0.05
1:0.02
2:0.09
3:0.17
4:0.09
5:0.04
6:0.06
7:0.04
8:0.04
9:0.07
10:0.14
11:0.13
Negative Logits
juven            -1.34
inexperienced    -1.31
depreciation     -1.29
inver            -1.27
ejac             -1.25
isot             -1.19
inexper          -1.18
athlet           -1.18
outright         -1.16
disposition      -1.16
Positive Logits
');              1.51
';               1.37
').              1.35
ffic             1.29
BLIC             1.29
united           1.27
"}],"            1.25
thouse           1.24
icates           1.23
official         1.21
Activations Density 0.000%