INDEX
    Explanations

    related to feedback requests and communication

    The token "Ben" or similar variations

    words starting with "Ben"

    Negative Logits
    Token           Logit
    " Aer"          -0.57
    " Ogre"         -0.56
    " Sod"          -0.55
    " Somers"       -0.55
    " Miroslav"     -0.54
    " Cuer"         -0.54
    " Carroll"      -0.53
    " Asi"          -0.52
    " Pele"         -0.52
    "Aer"           -0.51
    Positive Logits
    Token           Logit
    " Ben"          3.23
    "Ben"           3.03
    " BEN"          2.84
    " ben"          2.58
    "BEN"           2.54
    "ben"           2.39
    " Benjamin"     2.04
    " Бен"          2.00
    " Bens"         1.90
    "Benjamin"      1.89
    Activation Density: 0.133%

    No Known Activations