INDEX
    Explanations

    pejorative or mocking references to unflattering office behaviors or personalities

    Negative Logits
    Token (quoted)    Logit
    "Vice"            -0.53
    "ору"             -0.49
    "çu"              -0.48
    "Ple"             -0.48
    " Vice"           -0.47
    " oprot"          -0.44
    "iprot"           -0.44
    "inescence"       -0.43
    " \,\"            -0.43
    " stab"           -0.42

    Positive Logits
    Token (quoted)    Logit
    " boss"           3.11
    "boss"            2.66
    " Boss"           2.66
    "Boss"            2.56
    " bosses"         2.38
    " BOSS"           2.09
    "BOSS"            1.77
    "ボス"            1.13
    " chefe"          1.13
    " jefe"           1.10
    Activation Density: 0.002%

    No Known Activations