INDEX
    Explanations

    political figures and organizations

    names of people and organizations

    Negative Logits
    "enment"      -0.58
    " Debor"      -0.53
    " denomin"    -0.52
    "agher"       -0.51
    " attest"     -0.51
    "wealth"      -0.51
    " toile"      -0.48
    " sugg"       -0.48
    ".$"          -0.46
    "cms"         -0.45
    Positive Logits
    "atism"          0.54
    "cheat"          0.49
    "ropri"          0.48
    "ahu"            0.47
    "acial"          0.47
    "seless"         0.47
    " shouldn"       0.46
    " should"        0.46
    "hematically"    0.45
    "bably"          0.45
    Activation Density: 1.208%

    No Known Activations