INDEX
    Explanations

    words related to governmental and political themes

    Negative Logits
    еÑĢеж
    -0.17
    elon
    -0.15
     BITTE
    -0.15
    itchens
    -0.15
    205
    -0.14
    ighter
    -0.14
    apers
    -0.14
    ayd
    -0.14
     lesbische
    -0.14
    ought
    -0.14
    Positive Logits
    ål
    0.21
    åde
    0.19
    ät
    0.19
    infeld
    0.16
    ç´
    0.16
     å
    0.16
    askan
    0.16
    addr
    0.15
     Affero
    0.15
    ellan
    0.15
    Activation Density: 0.025%

    No Known Activations