    Explanations

    phrases related to countries or communities

    references to societal structures and collective identities

    Negative Logits
    âĿ              -0.76
    bots            -0.72
    mares           -0.71
     Emails         -0.70
     Ambro          -0.70
    uden            -0.69
    Joy             -0.67
    Rus             -0.66
    etz             -0.65
    xus             -0.65
    Positive Logits
     result         1.07
     consequence    1.01
     standalone     0.98
     spectator      0.88
     predictor      0.87
     footballer     0.84
     member         0.84
     viable         0.83
     cohesive       0.83
     tool           0.83
    Act Density 0.097%

    No Known Activations