    Explanations

    references to human involvement and social aspects in various contexts

    Negative Logits
    ikel      -0.14
    umba      -0.14
    stein     -0.14
    šil       -0.14
    thal      -0.14
    ully      -0.14
     grap     -0.14
    iano      -0.13
    aren      -0.13
    aven      -0.13
    Positive Logits
    vester    0.17
    eki       0.16
    ëłĪ       0.15
    rew       0.15
     Affero   0.14
    atar      0.14
    lig       0.14
     vur      0.14
    ĥ         0.14
     пеÑĢÑģ   0.14
    Act Density 0.059%

    No Known Activations