INDEX
    Explanations

    negations or contradictions

    negations or phrases that indicate a lack of something

    NEGATIVE LOGITS
    Token           Logit
    " spor"         -0.66
    " indirectly"   -0.65
    "tein"          -0.64
    " creations"    -0.62
    "arts"          -0.61
    "Js"            -0.58
    " rotated"      -0.57
    " oriented"     -0.57
    " towed"        -0.56
    " jointly"      -0.56

    POSITIVE LOGITS
    Token           Logit
    " unanim"       0.93
    "hin"           0.87
    "enough"        0.86
    " enough"       0.81
    "xus"           0.81
    " room"         0.80
    "ibaba"         0.79
    "Enough"        0.77
    "ANY"           0.76
    " anymore"      0.73
    Activation Density: 0.070%

    No Known Activations