    Explanations

    phrases indicating negative or undesirable situations

    statements that emphasize negation or denial

    Negative Logits
    aneers   -0.88
    エル     -0.86
    ngth     -0.84
    anamo    -0.75
    seys     -0.74
    ーン     -0.74
    alties   -0.74
    opez     -0.73
    ウス     -0.73
    dies     -0.72
    Positive Logits
     blat           0.77
     spac           0.71
     continuation   0.71
     generational   0.70
     happening      0.70
     whistlebl      0.69
     STEM           0.69
     blatant        0.66
     chance         0.65
     textbook       0.64
    Act Density 0.263%

    No Known Activations