INDEX
    Explanations

    concepts related to priorities and values in society

    NEGATIVE LOGITS
     GOODMAN
    -0.18
    isko
    -0.14
    onas
    -0.13
    iş
    -0.13
    ria
    -0.13
     Tư
    -0.13
    _DELETED
    -0.12
    _LS
    -0.12
    /tos
    -0.12
    adolu
    -0.12
    POSITIVE LOGITS
    :
    0.53
    ा:
    0.31
    ์:
    0.28
     namely
    0.27
    ：
    0.27
     یعنی
    0.26
    nam
    0.26
     viz
    0.24
     :
    0.24
    *:
    0.23
    Act Density 0.467%

    No Known Activations