INDEX
    Explanations

    phrases that express denial or contradiction

    Negative Logits
    " variation"     -0.09
    " Vari"          -0.09
    " Variation"     -0.09
    "Vari"           -0.08
    "itra"           -0.08
    " variations"    -0.08
    "ibold"          -0.08
    " variants"      -0.08
    "variation"      -0.07
    "versions"       -0.07
    Positive Logits
    " claim"         0.08
    " intended"      0.07
    " intent"        0.07
    " intends"       0.07
    " aim"           0.07
    " meant"         0.07
    " intend"        0.07
    " intending"     0.07
    "kins"           0.06
    " zoekt"         0.06
    Act Density 0.035%

    No Known Activations