INDEX
    Explanations

    proper names, especially those related to film and media

    Negative Logits
    kova  -0.15
    lob  -0.15
    OCR  -0.15
    .tt  -0.15
    immers  -0.14
    ่าย  -0.14
    apol  -0.14
    .cr  -0.14
    260  -0.14
    aptcha  -0.14
    Positive Logits
    chai  0.17
     Sesso  0.16
    zik  0.15
    stery  0.14
    案  0.14
    Canceled  0.14
     bathing  0.14
    しょ  0.14
     dressing  0.14
    体系  0.13
    Act Density 0.030%

    No Known Activations