INDEX
    Explanations

    punctuation marks, specifically periods

    Negative Logits
    Token       Logit
    "-"         -0.17
    "cie"       -0.15
    " Mage"     -0.15
    " crust"    -0.15
    " Ab"       -0.15
    "434"       -0.14
    " ["        -0.14
    " ins"      -0.14
    "?"         -0.14
    " ?"        -0.14
    Positive Logits
    Token           Logit
    "lemn"          0.17
    ".intellij"     0.16
    "à¥ĭà¤ļ"        0.16
    "avou"          0.15
    "åħ"            0.15
    "ovit"          0.15
    "á»ijc"         0.15
    "çĦ¡ãģĹãģ"      0.15
    "ĩ"             0.15
    "âĻª↵↵"         0.14
    Act Density 0.002%

    No Known Activations