    Explanations

    mentions of political figures and government bodies

    references to political entities, figures, and related terminology

    Negative Logits
    Token          Logit
    " Brow"        -0.49
    " seismic"     -0.48
    "imensional"   -0.48
    " Era"         -0.48
    "Merit"        -0.48
    " vortex"      -0.47
    " Place"       -0.47
    " Amen"        -0.47
    "oven"         -0.47
    " Filipino"    -0.47

    Positive Logits
    Token          Logit
    "tracks"       0.68
    "glers"        0.67
    "milo"         0.67
    "imaru"        0.63
    "issance"      0.63
    " alike"       0.62
    "enance"       0.60
    "retty"        0.59
    "stice"        0.57
    "itiz"         0.57

    Activation Density: 0.881%

    No Known Activations