    Explanations

    mentions of specific political figures

    Negative Logits
    Token           Logit
    ast             -0.06
    oÅĻ             -0.06
    aster           -0.06
    q               -0.06
    Ñĸб             -0.06
    stead           -0.06
    old             -0.05
    \Migrations     -0.05
    unga            -0.05
    469             -0.05
    Positive Logits
    Token           Logit
    ^K              0.07
    ednou           0.07
    arges           0.07
    .are            0.07
    rones           0.07
    inned           0.07
    aldi            0.07
    é¡Ķ             0.07
    DCF             0.07
    achel           0.07
    Act Density 0.000%

    No Known Activations