INDEX
    Explanations

    words indicating an action or movement

    Head Attr Weights
    0:0.08
    1:0.07
    2:0.07
    3:0.06
    4:0.09
    5:0.08
    6:0.08
    7:0.10
    8:0.08
    9:0.07
    10:0.07
    11:0.08
    Negative Logits
    " implementation"   -2.39
    " seminars"         -2.37
    " promotion"        -2.32
    " rollout"          -2.30
    " monet"            -2.28
    " ministries"       -2.27
    " Advertising"      -2.24
    " referees"         -2.19
    " claimants"        -2.16
    " implement"        -2.11
    Positive Logits
    "akable"            2.94
    "ciating"           2.63
    "acid"              2.54
    "ascript"           2.54
    "udder"             2.46
    "ebted"             2.44
    "eely"              2.36
    "Brave"             2.35
    "zx"                2.33
    "Luckily"           2.27
    Act Density 0.000%

    No Known Activations