INDEX
    Explanations

    instances of special characters and punctuation in text

    Negative Logits
    Token       Logit
    794         -0.16
    iran        -0.16
    ÑijÑĢ        -0.15
    олод        -0.14
    aldo        -0.14
    erged       -0.14
     Tan        -0.14
    arbon       -0.14
    iel         -0.14
    astered     -0.14
    Positive Logits
    Token       Logit
    onda        0.15
    rian        0.14
    鹿          0.14
    urette      0.14
     Vers       0.14
     disp       0.14
    _logits     0.13
    bies        0.13
    cles        0.13
     Hed        0.13
    Activation Density: 0.003%

    No Known Activations