INDEX
    Explanations

    references to government and political figures

    Negative Logits
    Token        Logit
    "ROTO"       -0.09
    "atem"       -0.08
    "eldom"      -0.08
    "eldorf"     -0.08
    "bbe"        -0.07
    ".vaadin"    -0.07
    "grily"      -0.07
    "radan"      -0.07
    "ikal"       -0.07
    "redd"       -0.07
    Positive Logits
    Token          Logit
    " exp"         0.06
    " federal"     0.05
    " Fluent"      0.05
    " NAS"         0.05
    " demon"       0.05
    " predicate"   0.05
    " stripe"      0.05
    " Liz"         0.05
    "åºŃ"          0.05
    "iyel"         0.05
    Activation Density: 0.092%

    No Known Activations