INDEX
    Explanations

    phrases indicating a continuous or repeated behavior

    instances of the word "been" in various contexts

    Negative Logits
    "eers"      -0.77
    "rones"     -0.69
    "achu"      -0.67
    "erity"     -0.65
    "opolis"    -0.64
    "arta"      -0.64
    " Bars"     -0.64
    "izable"    -0.64
    "odder"     -0.62
    "regate"    -0.62
    Positive Logits
    " able"     1.04
    " bitten"   1.03
    " seen"     0.97
    " taken"    0.96
    " given"    0.90
    " done"     0.88
    " eaten"    0.88
    " deemed"   0.85
    " shown"    0.84
    " beaten"   0.83
    Act Density 0.149%

    No Known Activations