INDEX
Explanations
negations and qualifiers that express low certainty or limitations
Head Attr Weights
0:0.02
1:0.08
2:0.13
3:0.03
4:0.02
5:0.04
6:0.16
7:0.06
8:0.18
9:0.12
10:0.06
11:0.04
Negative Logits
ameron  -1.15
ials  -1.08
alks  -1.02
bush  -1.01
endings  -0.99
actions  -0.99
raz  -0.98
rez  -0.97
ival  -0.96
ATIONS  -0.95
Positive Logits
orthy  1.16
otherwise  1.15
pires  1.14
aest  1.08
entimes  1.08
geographically  1.08
wearer  1.03
pport  1.03
Depend  1.03
pire  1.01
Activations Density 0.210%