INDEX
    Explanations

    references to cultural identity and diversity

    Negative Logits
    orsi      -0.20
    orsk      -0.19
    евид      -0.17
    istol     -0.17
    odes      -0.16
    uten      -0.15
    otten     -0.15
    fusc      -0.15
    olan      -0.14
    ovel      -0.14
    Positive Logits
    nech      0.16
    rej       0.15
    ìn        0.15
    éĢīæĭ     0.15
    ARGER     0.15
     Geile    0.15
    à¥įतव     0.14
    ìm        0.14
    ühr       0.14
    åłĤ       0.14
    Activation Density: 0.027%

    No Known Activations