INDEX
    Explanations

    words and names related to dominance and hierarchy

    Negative Logits
    mong          -0.18
    pery          -0.17
    afil          -0.15
     Kop          -0.15
     dup          -0.15
    bage          -0.15
    monds         -0.15
    ilen          -0.14
    topics        -0.14
    ëļ            -0.14

    Positive Logits
    estic         0.25
    åIJĪãĤıãģĽ     0.15
    posit         0.15
    á»ı           0.15
    uko           0.15
    ologne        0.14
    ingu          0.14
    ople          0.14
    etri          0.14
    uent          0.14
    Activation Density: 0.020%

    No Known Activations