INDEX
    Explanations

    phrases related to responsibility and accountability

    Head Attr Weights
    0:0.02
    1:0.03
    2:0.05
    3:0.06
    4:0.13
    5:0.02
    6:0.03
    7:0.39
    8:0.03
    9:0.03
    10:0.06
    11:0.10
    Negative Logits
    " vacations"          -1.60
    "isSpecialOrderable"  -1.58
    "irement"             -1.54
    " invitations"        -1.46
    "ortality"            -1.45
    "izons"               -1.44
    " cathedral"          -1.44
    " vacation"           -1.43
    "soDeliveryDate"      -1.43
    "irements"            -1.41
    Positive Logits
    " Fever"   1.48
    "ye"       1.44
    " MIS"     1.42
    "mis"      1.40
    "truth"    1.37
    " coy"     1.37
    " whiff"   1.34
    "deen"     1.34
    "izzard"   1.33
    "agging"   1.32
    Act Density 0.000%

    No Known Activations