INDEX
    Explanations

    expressions of subjective feelings or thoughts

    phrases that express feelings of comparison or similarity

    Negative Logits
    Token            Logit
    "byn"            -0.76
    "oust"           -0.71
    "conservancy"    -0.70
    "omen"           -0.68
    "alt"            -0.66
    "itions"         -0.65
    "ertain"         -0.64
    "edient"         -0.63
    "ais"            -0.63
    "DonaldTrump"    -0.62
    Positive Logits
    Token            Logit
    " crap"          1.02
    " shit"          0.92
    "lier"           0.84
    " pulling"       0.78
    " stepping"      0.75
    " spitting"      0.74
    " jumping"       0.73
    " throwing"      0.72
    " admitting"     0.72
    " quitting"      0.70
    Activation Density: 0.032%

    No Known Activations