Explanations
phrases related to responsibility and accountability
Head Attr Weights
0:0.02
1:0.03
2:0.05
3:0.06
4:0.13
5:0.02
6:0.03
7:0.39
8:0.03
9:0.03
10:0.06
11:0.10
Negative Logits
vacations           -1.60
isSpecialOrderable  -1.58
irement             -1.54
invitations         -1.46
ortality            -1.45
izons               -1.44
cathedral           -1.44
vacation            -1.43
soDeliveryDate      -1.43
irements            -1.41
Positive Logits
Fever    1.48
ye       1.44
MIS      1.42
mis      1.40
truth    1.37
coy      1.37
whiff    1.34
deen     1.34
izzard   1.33
agging   1.32
Activations Density: 0.000%