INDEX
Explanations
passive constructions in sentences
Head Attr Weights
0:0.08
1:0.05
2:0.08
3:0.09
4:0.08
5:0.08
6:0.08
7:0.07
8:0.09
9:0.08
10:0.09
11:0.08
Negative Logits
manifest: -1.89
prevail: -1.89
advoc: -1.83
affecting: -1.76
favor: -1.75
impacting: -1.73
ration: -1.73
conserve: -1.73
bold: -1.71
favour: -1.71
Positive Logits
ioch: 1.86
76561: 1.86
XT: 1.84
Bridgewater: 1.79
ibrary: 1.79
ector: 1.77
HAM: 1.76
NES: 1.72
ENE: 1.71
OX: 1.67
Activations Density 0.000%