INDEX
    Explanations

    phrases related to political figures and events

    references to political figures or discussions about politics

    Negative Logits
    Token            Value
    " decomp"        -0.72
    "hift"           -0.68
    " comr"          -0.66
    " JPEG"          -0.64
    " cro"           -0.61
    " scrim"         -0.61
    "ĨĴ"             -0.61
    " unaccount"     -0.58
    " Mess"          -0.57
    " Rivals"        -0.56
    Positive Logits
    Token            Value
    "s"              1.22
    "ship"           1.06
    "alty"           0.95
    "ufact"          0.93
    "tarian"         0.89
    "sf"             0.88
    "tal"            0.88
    "gins"           0.87
    "ity"            0.87
    "sg"             0.86
    Activation Density: 0.115%

    No Known Activations