INDEX
    Explanations

    punctuation marks, particularly commas and periods

    Negative Logits
    .scalablytyped
    -0.16
    .aspx
    -0.13
    ỏ
    -0.13
     başına
    -0.13
     hızla
    -0.13
    odash
    -0.12
    ùa
    -0.12
    ậy
    -0.12
     sahibi
    -0.12
     Smoke
    -0.12
    Positive Logits
    ,
    0.41
    .
    0.34
    .↵
    0.26
     ,
    0.22
     and
    0.21
    .↵↵
    0.21
       
    0.21
    0.20
     the
    0.19
    ,↵
    0.17
    Act Density 0.060%

    No Known Activations