    Explanations

    words associated with destruction or damage

    Negative Logits
    Token       Logit
    "eve"       -0.89
    "birth"     -0.78
    "ORGE"      -0.78
    "women"     -0.77
    "xon"       -0.72
    "meal"      -0.71
    "yip"       -0.69
    " Sakuya"   -0.68
    " hemor"    -0.67
    "masters"   -0.66

    Positive Logits
    Token       Logit
    "anca"      1.46
    "eness"     1.10
    "anches"    0.92
    "anch"      0.91
    "ahn"       0.90
    "ack"       0.88
    "anc"       0.87
    "ossom"     0.86
    "ilty"      0.86
    "oks"       0.85

    Activation Density: 0.004%

    No Known Activations