INDEX
    Explanations

    phrases expressing opinions or beliefs

    assertive statements of personal opinion or belief

    Negative Logits
    ゴン            -0.76
    闘              -0.76
     unavailable    -0.74
    覚醒            -0.70
    魔              -0.69
    ドラ            -0.67
    alid            -0.67
    none            -0.63
    iann            -0.63
    ズ              -0.62
    Positive Logits
     underrated        1.11
     underest          0.96
     underestimate     0.93
     deserved          0.93
     underestimated    0.92
     fair              0.89
     ought             0.89
     overest           0.88
     beh               0.87
     unfairly          0.87
    Activation Density 0.327%

    No Known Activations