INDEX
    Explanations

    proper nouns or personal names

    mentions of a specific individual, likely a notable person in sports

    Negative Logits
    Token         Logit
    "terday"      -0.79
    "EED"         -0.79
    " conclud"    -0.70
    "eele"        -0.68
    "anwhile"     -0.67
    "ignment"     -0.60
    "align"       -0.59
    "WAYS"        -0.59
    "ateral"      -0.59
    "enegger"     -0.58
    Positive Logits
    Token       Logit
    "rique"     1.35
    "ning"      1.07
    "rik"       1.04
    "riks"      1.03
    "nery"      0.99
    "sel"       0.98
    "lein"      0.98
    "ricks"     0.95
    "ric"       0.93
    "lund"      0.92
    Activation Density: 0.029%

    No Known Activations