INDEX
    Explanations

    references to roles or positions within organizations

    Negative Logits
    Token      Logit
    "erson"    -0.15
    "uba"      -0.14
    " hom"     -0.14
    " kur"     -0.13
    "cratch"   -0.13
    "resher"   -0.13
    " CFG"     -0.13
    "lope"     -0.13
    " short"   -0.13
    "ička"     -0.13
    Positive Logits
    Token        Logit
    "jam"        0.15
    "itsu"       0.15
    " DISP"      0.15
    "ニック"      0.15
    "-widgets"   0.15
    "URA"        0.14
    " فار"       0.14
    "ucken"      0.14
    "amba"       0.14
    "αρα"        0.14
    Activation Density: 0.147%

    No Known Activations