INDEX
    Explanations

    instances of words that reveal political or social issues

    Negative Logits
    " surpr"        -0.53
    "CLASSIFIED"    -0.50
    "NOW"           -0.49
    "tal"           -0.49
    "llor"          -0.48
    "Voice"         -0.48
    "roth"          -0.47
    "rolet"         -0.46
    "emo"           -0.46
    "henko"         -0.46
    Positive Logits
    " lieu"         1.15
    " accordance"   1.07
    " favor"        0.95
    " conjunction"  0.94
    " vitro"        0.88
    " order"        0.88
    " favour"       0.87
    " regards"      0.87
    "efficiency"    0.87
    "ordinate"      0.86
    Activation Density: 8.165%

    No Known Activations