INDEX
    Explanations

    proper nouns related to individuals, possibly in a news or political context

    Negative Logits
    Token         Logit
    "terday"      -0.74
    "EED"         -0.72
    " conclud"    -0.71
    "eele"        -0.64
    "anwhile"     -0.63
    "Reference"   -0.61
    "Sax"         -0.59
    "ENTS"        -0.58
    "henko"       -0.58
    "ateral"      -0.57
    Positive Logits
    Token         Logit
    "rique"       1.19
    "ning"        1.05
    "riks"        0.95
    "rik"         0.94
    "sel"         0.93
    "lein"        0.92
    "agar"        0.92
    "nery"        0.90
    "ricks"       0.88
    "ners"        0.88
    Activation Density: 6.595%

    No Known Activations