INDEX
    Explanations

    symbols and punctuation marks that indicate structure in text

    Negative Logits
    ione     -0.16
    abd      -0.16
    —↵↵      -0.15
     (↵↵     -0.14
    енс      -0.14
    äm       -0.14
    ije      -0.14
    Ỽ        -0.14
    ό        -0.14
    ios      -0.13
    Positive Logits
     •       0.32
    •        0.27
     ・      0.23
    ·        0.22
    bullet   0.22
     ·       0.21
    :↵       0.21
    ・       0.20
     ●       0.20
    ●        0.20
    Act Density 0.121%

    No Known Activations