INDEX
    Explanations

    sentences expressing strong emotions or opinions

    repetitive statements and assertions

    Negative Logits
    esm       -0.78
    elve      -0.73
    dozen     -0.71
    styles    -0.70
    aneers    -0.69
    luaj      -0.68
    ses       -0.68
    byss      -0.67
    opez      -0.66
    enne      -0.65
    Positive Logits
     why             0.93
     unacceptable    0.93
     happening       0.91
     how             0.90
     NOT             0.85
     what            0.84
     supposed        0.84
     shaping         0.81
     bullshit        0.80
     definitely      0.80
    Act Density 0.097%

    No Known Activations