INDEX
Explanations
phrases indicating time-sensitive events or limitations
Head Attr Weights
0:0.04
1:0.02
2:0.13
3:0.19
4:0.08
5:0.02
6:0.17
7:0.03
8:0.07
9:0.05
10:0.08
11:0.07
Negative Logits
udi       -1.40
ievers    -1.18
aunders   -1.10
ogie      -1.10
cknow     -1.09
];        -1.09
Franco    -1.08
/"        -1.07
college   -1.06
contrace  -1.05
Positive Logits
downright   1.30
itself      1.27
quite       1.21
marked      1.16
thereafter  1.16
pretty      1.14
oya         1.14
フ          1.14
encia       1.13
initely     1.12
Activations Density 0.090%