    Negative Logits
    Token               Logit
    " colonel"          0.33
    " professor"        0.33
    " superintendent"   0.33
    " supervisor"       0.32
    " senior"           0.31
    " commander"        0.30
    "uliert"            0.30
    " commiss"          0.29
    " sergeant"         0.29
    "jacobian"          0.29

    Positive Logits
    Token               Logit
    " entities"         0.32
    " SignUp"           0.31
    " Thrive"           0.30
    " Tensor"           0.30
    " nonprofit"        0.30
    " startups"         0.30
    " nonprofits"       0.30
    " Forums"           0.30
    " datasets"         0.29
    " формула"          0.29

    Activation Density: 0.002%

    No Known Activations