INDEX
    Explanations

    references to specific cultural or ethnic identities

    Negative Logits
    itesse      -0.16
    licht       -0.16
    ろ          -0.14
    tainment    -0.14
    egin        -0.14
    aur         -0.14
    aç          -0.14
     monarch    -0.14
    ivate       -0.13
    auen        -0.13
    Positive Logits
    ardo        0.18
    zelf        0.18
    lops        0.17
    lopedia     0.17
    otope       0.16
    々          0.15
    starter     0.15
    cing        0.15
    coli        0.15
     thước      0.15
    Activation Density: 0.081%

    No Known Activations