INDEX
    Explanations

    references to soreness or discomfort

    Negative Logits
    Token                Logit
    "ftagPool"           -0.45
    " للمعارف"           -0.44
    "sweise"             -0.43
    "modelBuilder"       -0.43
    "FontOfSize"         -0.41
    "NameInMap"          -0.40
    " Exp"               -0.40
    "GTCX"               -0.39
    " Confer"            -0.39
    "ように"              -0.38
    Positive Logits
    Token                Logit
    "Sore"               0.75
    " Sore"              0.71
    "sore"               0.69
    " sore"              0.64
    " soreness"          0.57
    "DebuggerNonUser"    0.56
    " Sorensen"          0.50
    "throat"             0.48
    " saira"             0.48
    " throat"            0.45
    Activation Density: 0.001%

    No Known Activations