    Explanations

    words related to leadership positions or titles

    Negative Logits
    Token           Logit
    "phis"          -0.88
    "nikov"         -0.88
    "lished"        -0.80
    "lishes"        -0.79
    "ajor"          -0.74
    "aughs"         -0.73
    "etimes"        -0.71
    "Sov"           -0.70
    "ppo"           -0.70
    "lihood"        -0.69
    Positive Logits
    Token           Logit
    " executive"     1.11
    "doms"           1.00
    " executives"    0.86
    " Executive"     0.85
    "iary"           0.84
    "IAL"            0.78
    " negotiator"    0.77
    " culprit"       0.74
    " rabbi"         0.74
    " editor"        0.73
    Activation Density: 0.051%

    No Known Activations