    Explanations

    references to personal relationships and social roles

    Negative Logits
    Token     Logit
    upo       -0.17
    antz      -0.17
    essler    -0.17
    lak       -0.17
    uve       -0.17
     è¦       -0.16
     FAG      -0.16
    aggi      -0.15
    огод      -0.15
    ufe       -0.15
    Positive Logits
    Token     Logit
    owing     0.15
    482       0.15
    _hook     0.15
    682       0.15
    inese     0.15
    ading     0.15
    ijn       0.14
     Voll     0.14
    ddb       0.14
    alse      0.14
    Activation Density: 0.383%

    No Known Activations