INDEX
Explanations
references to locations or positions
Head Attr Weights
0:0.04
1:0.02
2:0.04
3:0.24
4:0.02
5:0.06
6:0.01
7:0.05
8:0.03
9:0.01
10:0.40
11:0.02
Negative Logits
unsuccessful      -2.77
unsuccessfully    -2.55
difficulties      -2.42
differently       -2.29
successive        -2.26
fewer             -2.26
deteriorating     -2.25
improved          -2.24
protracted        -2.21
strengthened      -2.15
Positive Logits
behind    3.17
pedia     2.40
inside    2.25
yours     2.21
cradle    2.18
"...      2.18
Behind    2.16
!!!!      2.15
ahead     2.12
guy       2.06
Activations Density 0.036%