INDEX
    Explanations

    punctuation marks and sentence-ending symbols

    Negative Logits
    Token     Logit
    Č         -0.16
    atial     -0.14
    edb       -0.14
    yasal     -0.13
    ibrated   -0.13
    ;;;;;;    -0.13
    IDS       -0.13
    APP       -0.13
    n         -0.13
     _↵↵      -0.12

    Positive Logits
    Token     Logit
     })(      0.19
     ),       0.17
     ",       0.16
    gem       0.16
    eks       0.15
     )[       0.15
    eval      0.15
     },{      0.15
    rips      0.15
    ÏĥÏī      0.14
    Activation Density: 0.093%

    No Known Activations