INDEX
    Explanations

    negations and phrases expressing disagreement or urging caution

    Negative Logits
    Token               Logit
    "UniformLocation"   -0.58
    " ProtoMessage"     -0.57
    "Tracce"            -0.57
    "شوف"               -0.48
    " aikaa"            -0.47
    " pylint"           -0.46
    " aapt"             -0.46
    " đích"             -0.46
    " calendriers"      -0.45
    "istolet"           -0.45
    Positive Logits
    Token               Logit
    " worry"            1.07
    " forget"           1.05
    " Worry"            0.81
    " underestimate"    0.78
    " Forget"           0.76
    " fret"             0.75
    "worry"             0.75
    " waste"            0.73
    "Forget"            0.71
    " оригіналу"        0.71
    Activation Density: 0.103%

    No Known Activations