INDEX
    Explanations

    proper nouns or specific names, particularly related to websites or brands

    the presence of specific special characters or placeholders

    Negative Logits
    Token            Logit
    "ĸļ"             -0.86
    "terday"         -0.70
    "oÄŁ"            -0.65
    " compr"         -0.64
    " derail"        -0.63
    " retrospect"    -0.63
    " htt"           -0.62
    "peer"           -0.59
    " parted"        -0.59
    " craw"          -0.59
    Positive Logits
    Token            Logit
    "roups"          1.30
    "raphic"         1.18
    "iants"          1.16
    "AMES"           1.12
    "iant"           1.11
    "RAY"            1.11
    "reetings"       1.10
    "ossip"          1.09
    "uild"           1.08
    "ourmet"         1.08
    Activation Density: 0.035%

    No Known Activations