INDEX
    Explanations

    verbs and phrases indicating intentional or deliberate actions

    Head Attr Weights
    0:0.08
    1:0.03
    2:0.15
    3:0.08
    4:0.17
    5:0.11
    6:0.03
    7:0.02
    8:0.09
    9:0.11
    10:0.05
    11:0.03
    Negative Logits
    anian:-1.46
    shire:-1.35
     Sirius:-1.32
    insula:-1.32
    ilion:-1.31
    yip:-1.27
    LI:-1.25
    Score:-1.22
    cells:-1.20
    asia:-1.18
    Positive Logits
     deceived:1.26
     unlawful:1.22
     unethical:1.16
     manslaughter:1.16
    iazep:1.16
     forbidden:1.15
     gou:1.14
     prohibited:1.11
    pelled:1.11
     harm:1.11
    Act Density 0.019%

    No Known Activations