INDEX
    Explanations

    phrases related to political figures or events

    Negative Logits
     PB          -0.70
     RELE        -0.69
     sid         -0.68
     WW          -0.68
     DEC         -0.66
     OW          -0.65
     hig         -0.64
     DIRECT      -0.63
     ASP         -0.63
     AUD         -0.63
    Positive Logits
    antes         1.10
    idation       1.03
    opia          1.03
    ansion        1.02
    idy           1.01
    orks          1.01
    ois           1.00
    gettable      1.00
    olic          0.99
    rium          0.99
    Activation Density: 0.260%

    No Known Activations