INDEX
    Explanations

    words related to social or political issues, particularly related to culture, politics, and legislation

    Negative Logits
    Denote      -0.55
     mirador    -0.54
     fortæ      -0.52
    ilangkan    -0.49
    bahaya      -0.48
     lcm        -0.48
     barcel     -0.47
     revisa     -0.47
    uklu        -0.46
    ropshire    -0.46
    Positive Logits
     been       0.71
     tats       0.68
     persino    0.68
     zyn        0.65
     stockholm  0.65
     blos       0.64
     ridu       0.64
     vry        0.64
     BEEN       0.64
     affez      0.63
    Act Density 0.479%

    No Known Activations