INDEX
    Explanations

    references to political candidates and their party affiliations

    Negative Logits
    Token       Logit
    "rece"      -0.17
    "ocities"   -0.15
    " tw"       -0.15
    "inkle"     -0.15
    " Fat"      -0.15
    "ienne"     -0.15
    "Fat"       -0.15
    " Trace"    -0.14
    "Trace"     -0.14
    " تÙĪ"      -0.14
    Positive Logits
    Token       Logit
    "rita"      0.16
    "žel"       0.15
    "ween"      0.15
    "585"       0.14
    "lient"     0.14
    "_nt"       0.14
    " Strength" 0.14
    "asso"      0.14
    "itemap"    0.14
    "bject"     0.14
    Activation Density: 0.085%

    No Known Activations