INDEX
    Explanations

    statements about opinions or thoughts

    expressions of personal opinion or beliefs

    Negative Logits
    Token            Logit
    "ãĥīãĥ©"         -0.72
    "arthed"         -0.71
    " reportedly"    -0.68
    " allegedly"     -0.68
    "atars"          -0.62
    "uid"            -0.62
    "lict"           -0.61
    "isi"            -0.59
    "ROR"            -0.58
    "éĹĺ"            -0.58
    Positive Logits
    Token                Logit
    " underestimate"     0.86
    " underest"          0.83
    " underestimated"    0.80
    " miscon"            0.79
    " horm"              0.73
    " misunder"          0.73
    " overest"           0.73
    " underrated"        0.71
    " misconception"     0.70
    " somew"             0.70
    Activation Density: 0.487%

    No Known Activations