INDEX
    Explanations

    punctuation marks, particularly periods

    Negative Logits
    awtextra        -0.76
    extAlignment    -0.73
    khid            -0.71
    Stit            -0.69
    gjenge          -0.67
    jectures        -0.67
    agré            -0.66
    Hig             -0.65
     مض             -0.63
    ぐれ            -0.63
    Positive Logits
    ])).        1.16
    ()].        1.15
    $.}         1.12
    ']").       1.08
    ]").        1.05
    __).        1.04
     }}$.       1.02
    \.          0.98
    "]').       0.98
    ).}         0.97
    Act Density 0.458%

    No Known Activations