INDEX
    Explanations

    proper nouns, particularly names and titles

    Negative Logits
    Token           Logit
    ulo             -0.16
    iss             -0.15
    (whitespace)    -0.15
    -bodied         -0.15
    zin             -0.15
    iability        -0.15
    led             -0.15
    lessly          -0.15
    aska            -0.14
    zan             -0.14
    Positive Logits
    Token           Logit
    errat           0.23
    (mon            0.19
    oton            0.19
    gomery          0.19
    serrat          0.18
    roe             0.18
    .Mon            0.17
    tréal           0.17
    итор            0.17
    stery           0.17
    Activation Density 0.050%

    No Known Activations