    Explanations

    phrases related to actions that involve influence, control, or response

    words indicating actions or intentions

    Negative Logits
    Token            Logit
    "abiding"        -0.85
    "afety"          -0.75
    "answered"       -0.67
    "76561"          -0.66
    "ById"           -0.66
    " apologised"    -0.66
    "checking"       -0.64
    " Died"          -0.63
    "cens"           -0.63
    "breeding"       -0.63
    Positive Logits
    Token            Logit
    " elevate"       0.91
    " begin"         0.91
    " accelerate"    0.91
    " unleash"       0.89
    " extend"        0.88
    " bring"         0.85
    " widen"         0.85
    " seize"         0.83
    " expand"        0.83
    " plunge"        0.83
    Act Density 0.490%

    No Known Activations