    Explanations

    positive and negative evaluations emphasizing contrast and comparison

    contradictory statements or contrasting phrases in evaluations or reviews

    Negative Logits
    reen            -0.74
    dayName         -0.73
    éĹĺ             -0.73
    nan             -0.72
    llah            -0.72
    successfully    -0.72
    vance           -0.70
    shore           -0.70
    itched          -0.68
    si              -0.66
    Positive Logits
     beware          1.07
     alas            1.02
     unfortunately   1.02
     downside        0.90
     hindered        0.88
     drawbacks       0.87
     sadly           0.81
     hampered        0.81
     pitfalls        0.81
     compromises     0.80
    Activation Density: 0.357%

    No Known Activations