    Explanations

    phrases related to workplace safety and communication

    Negative Logits
     shenan      -1.65
     unspeak     -1.53
     disagre     -1.49
     maneu       -1.47
     hairc       -1.45
     unwarran    -1.45
     horrend     -1.42
     apprehen    -1.40
     affor       -1.39
     impra       -1.39
    Positive Logits
    <bos>        1.23
     ***!        0.82
    ↵↵           0.81
    ']."         0.78
    ↵↵↵          0.77
    <eos>        0.77
    __))         0.76
    ferrer       0.75
    }}           0.74
    .”           0.74
    Activation Density 0.151%

    No Known Activations