    Negative Logits
    Token            Logit
    " AST"           -0.07
    " prosperous"    -0.07
    " LY"            -0.06
    "کر"             -0.06
    "-serif"         -0.06
    "Serial"         -0.06
    " allied"        -0.06
    " INTO"          -0.06
    " BOT"           -0.06
    "Jimmy"          -0.06
    Positive Logits
    Token            Logit
    " would"         0.08
    "ModelIndex"     0.07
    "--↵"            0.07
    "atement"        0.06
    "walking"        0.06
    " birthday"      0.06
    " ihr"           0.06
    ".fragment"      0.06
    "activation"     0.06
    " Invent"        0.06
    Act Density 0.030%

    No Known Activations