INDEX
    Explanations

    terms related to political topics and entities

    Negative Logits
    Token     Logit
    ertas     -0.16
    iyet      -0.16
    enting    -0.16
    uteur     -0.15
    anja      -0.15
    zej       -0.14
    ónico     -0.14
    511       -0.14
    otland    -0.14
    äll       -0.14
    Positive Logits
    Token         Logit
    icians        0.28
    ician         0.23
     correct      0.23
    correct       0.21
    ical          0.21
     Correct      0.21
    ically        0.20
     incorrect    0.20
    ICS           0.20
    ifact         0.19
    Activation Density: 0.007%

    No Known Activations