INDEX
    Explanations

    politically charged language and references to historical events or figures

    Negative Logits
    Token        Logit
    " Essex"     -0.88
    " Hudson"    -0.85
    " DAN"       -0.83
    "Hudson"     -0.83
    " MEL"       -0.80
    " Hawkes"    -0.80
    "Essex"      -0.77
    " Aubrey"    -0.77
    "MEL"        -0.77
    " Mel"       -0.76
    Positive Logits
    Token           Logit
    " Kalam"        0.87
    " Jackson"      0.86
    " Amin"         0.82
    " Jungkook"     0.82
    " Coimbatore"   0.78
    " Minne"        0.77
    " Jimin"        0.76
    "Jackson"       0.74
    " JACKSON"      0.74
    " Ackerman"     0.73
    Activation Density: 1.536%

    No Known Activations