INDEX
    Explanations

    phrases related to physical distance or separation

    instances of the word "away"

    Negative Logits
    efe         -0.70
     Chung      -0.66
     sshd       -0.65
    ured        -0.62
     Ellison    -0.62
    emort       -0.61
     Perkins    -0.61
    ãĤ³         -0.60
     Butterfly  -0.60
    erity       -0.60
    Positive Logits
    fitting     0.75
    ments       0.72
    coming      0.71
    rent        0.69
     leagues    0.68
    world       0.68
    away        0.68
    fits        0.67
    ILA         0.67
    irts        0.66
    Activation Density: 0.047%

    No Known Activations