INDEX
    Explanations

    punctuation marks that indicate questions and exclamations

    Negative Logits
    Token            Logit
    " nearby"        -0.16
    " Indeed"        -0.16
    " Meanwhile"     -0.16
    "Meanwhile"      -0.16
    " Dit"           -0.15
    " sez"           -0.15
    " indeed"        -0.15
    "Similarly"      -0.14
    " Similarly"     -0.13
    "writes"         -0.13

    Positive Logits
    Token            Logit
    "ITS"            0.29
    "its"            0.26
    " Lets"          0.25
    "Lets"           0.25
    " ITS"           0.25
    " its"           0.25
    " Its"           0.24
    "Its"            0.24
    " Majority"      0.20
    "lets"           0.20

    Activation Density 0.592%

    No Known Activations