INDEX
    Explanations

    descriptions of actions or decisions

    instances of the word "move" in various contexts

    Negative Logits
    "omial"      -0.78
    "oola"       -0.71
    " sqor"      -0.66
    "iciency"    -0.65
    " Barton"    -0.65
    " Cav"       -0.64
    "inges"      -0.64
    "acha"       -0.63
    "icum"       -0.63
    "sung"       -0.62
    Positive Logits
    "able"        0.90
    " toward"     0.81
    "Motion"      0.81
    "backs"       0.80
    "ivism"       0.78
    "itures"      0.78
    "over"        0.76
    " towards"    0.76
    "ments"       0.75
    " forward"    0.75
    Activation Density: 0.034%

    No Known Activations