    Explanations

    phrases related to semantics and political discourse

    Negative Logits
    Token          Logit
    "alla"         -0.15
    "chalk"        -0.14
    "ilar"         -0.13
    "ifton"        -0.13
    "lik"          -0.13
    "ione"         -0.13
    " prior"       -0.13
    " diam"        -0.13
    "rances"       -0.13
    " "            -0.13
    Positive Logits
    Token          Logit
    " Hell"        0.19
    "roulette"     0.18
    " hell"        0.18
    " kab"         0.16
    " equivalent"  0.16
    " Gord"        0.16
    " Equivalent"  0.16
    " Kab"         0.15
    " baise"       0.15
    "etine"        0.15
    Activation Density: 0.260%

    No Known Activations