INDEX
Explanations
directional phrases and positional references
Head Attr Weights
0:0.09
1:0.03
2:0.12
3:0.04
4:0.07
5:0.03
6:0.08
7:0.05
8:0.14
9:0.03
10:0.09
11:0.19
Negative Logits
icago: -1.49
ById: -1.39
Divide: -1.39
DonaldTrump: -1.38
unity: -1.35
nels: -1.32
ftime: -1.31
Balance: -1.31
ICO: -1.31
Baltimore: -1.31
Positive Logits
unal: 1.41
sympathetic: 1.29
rehend: 1.28
charism: 1.26
voice: 1.25
interviewed: 1.24
stimulating: 1.20
extrater: 1.19
Luxem: 1.19
actors: 1.18
Activations Density 0.001%