INDEX
    Explanations

    words related to strong negative expressions or criticism

    negative evaluations or criticisms

    Negative Logits
    uclear        -0.76
    isations      -0.70
    ouf           -0.67
    teness        -0.66
    utherford     -0.66
    Lago          -0.65
    ATIONS        -0.65
    ATIONAL       -0.64
    isation       -0.63
     nm           -0.62
    Positive Logits
     dick         1.05
    sylvania      0.95
    asses         0.95
     suck         0.93
    eries         0.89
    shit          0.84
    hots          0.83
    bowl          0.83
    driver        0.82
    loads         0.81
    Act Density 0.013%

    No Known Activations