INDEX
    Explanations

    terms associated with serious harm or injury

    Negative Logits
        Token                Logit
        " bye"               -0.73
        " LCS"               -0.61
        " Eag"               -0.60
        " Tanz"              -0.58
        " foam"              -0.58
        " Principal"         -0.58
        " Finn"              -0.56
        " blindly"           -0.55
        " Noise"             -0.55
        " electronically"    -0.55
    Positive Logits
        Token                Logit
        "ous"                2.73
        "ously"              2.47
        "OUS"                1.57
        "osity"              1.38
        "iously"             1.34
        "istically"          1.30
        "istic"              1.28
        "ized"               1.22
        "ious"               1.22
        "izing"              1.22
    Activation Density: 0.018%

    No Known Activations