INDEX
    Explanations

    phrases related to feedback or evaluation

    expressions of frustration or dissatisfaction

    Negative Logits
    (tokens quoted so that leading spaces stay visible)
    'cedented'         -0.61
    'imet'             -0.56
    ' unlawfully'      -0.56
    'Enlarge'          -0.56
    ' unlawful'        -0.53
    'ridor'            -0.53
    '\":'              -0.53
    'interstitial'     -0.52
    ' ultraviolet'     -0.51
    ' jointly'         -0.51
    Positive Logits
    (tokens quoted so that leading spaces stay visible)
    ' honestly'        0.97
    ' anyways'         0.90
    'Anyway'           0.85
    ' however'         0.82
    ' admittedly'      0.81
    ' frankly'         0.78
    ' Anyway'          0.77
    ' anyway'          0.75
    ' tho'             0.74
    ' pity'            0.73
    Act Density 1.008%
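
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that activation density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1,000 token positions, 10 of which fire.
toy = [0.0] * 990 + [1.5] * 10
density = activation_density(toy)
print(f"{density:.3%}")  # prints 1.000%
```

Under that assumed definition, a density near 1% would mean the feature fires on roughly one sampled token in a hundred.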

    No Known Activations