INDEX
    Explanations

    negative descriptions or critiques

    words related to vilification and criticism

    Negative Logits
    Token         Logit
    " cropped"    -0.67
    "hr"          -0.65
    " newsp"      -0.61
    "print"       -0.60
    " hearty"     -0.60
    "Self"        -0.60
    " Sevent"     -0.60
    "Publisher"   -0.59
    " Requ"       -0.59
    "Ģ"           -0.58
    Positive Logits
    Token         Logit
    "vil"         1.06
    "icious"      0.86
    "chio"        0.84
    "ibrary"      0.81
    "ionage"      0.80
    "theless"     0.79
    "zx"          0.79
    " Nadu"       0.78
    " destro"     0.78
    "arge"        0.77
    Activation Density 0.006%

    No Known Activations