INDEX
    Explanations

    positions of authority or leadership

    Negative Logits
    "anwhile"      -0.98
    "itsch"        -0.86
    "alon"         -0.84
    "tics"         -0.82
    " Brist"       -0.82
    "affles"       -0.80
    "wana"         -0.80
    " Carnegie"    -0.79
    "ONES"         -0.78
    "ulu"          -0.78
    Positive Logits
    " versa"        1.74
    "hum"           1.00
    " mate"         0.88
    "ners"          0.87
    "quel"          0.84
    "ned"           0.83
    "hal"           0.82
    " vice"         0.81
    "reg"           0.81
    "iously"        0.80
    Act Density 4.382%

    No Known Activations