INDEX
    Explanations

    political references and themes

    Negative Logits
    ury      -0.18
    eus      -0.17
    mind     -0.16
    ected    -0.15
    eum      -0.15
    uded     -0.14
     Ston    -0.14
    oğ       -0.14
    eil      -0.14
    uali     -0.14
    Positive Logits
    ere      0.22
    heits    0.22
    heid     0.20
    heit     0.18
    hei      0.18
    es       0.17
    este     0.16
    erer     0.16
     orient  0.16
    olan     0.16
    Activation Density: 0.049%

    No Known Activations