INDEX
    Explanations

    mentions of particular political parties

    Negative Logits
    loor  -0.15
    ook  -0.14
    PerPixel  -0.14
    ypo  -0.14
    +xml  -0.14
     piè  -0.14
    portun  -0.13
    lamaz  -0.13
    ollar  -0.13
    ient  -0.13
    Positive Logits
    eguard  0.17
    arih  0.15
    ascus  0.15
    erta  0.14
    arded  0.14
    hir  0.14
    umer  0.14
    iki  0.14
    ottom  0.13
    QA  0.13
    Act Density 0.008%

    No Known Activations