INDEX
    Explanations

    references to authority or hierarchy, particularly related to "super" or "supreme."

    Negative Logits
    Token       Logit
    "toa"       -0.16
    "üst"       -0.15
    "zes"       -0.15
    "ü"         -0.15
    "ęk"        -0.14
    "ture"      -0.14
    "tar"       -0.14
    "tier"      -0.14
    "tail"      -0.14
    "zing"      -0.14
    Positive Logits
    Token       Logit
    "erv"       0.23
    "posing"    0.22
    " sup"      0.22
    "posed"     0.22
    "erville"   0.21
    "reme"      0.20
    "ervisor"   0.20
    "เปอร"      0.20
    "erset"     0.20
    "erval"     0.19
    Act Density 0.012%

    No Known Activations