INDEX
    Explanations
    Negative Logits
     ultras         -0.81
     Galileo        -0.69
     Duterte        -0.68
     Kazakh         -0.67
     Sakuya         -0.67
     Kazakhstan     -0.66
     unfavorable    -0.65
     Croatian       -0.64
     Bulgarian      -0.64
     globalization  -0.63
    Positive Logits
    ford            1.23
    worth           1.18
    ley             1.18
    burgh           1.17
    field           1.15
    beck            1.15
    ridge           1.14
    cliffe          1.12
    herty           1.11
    ney             1.11
    Activation Density: 0.276%

    No Known Activations