    Explanations

    names of people being targeted or affected by violence or injustice

    Negative Logits
    phis          -1.00
    s             -0.82
    acular        -0.82
    ivas          -0.80
    ivery         -0.80
    enium         -0.80
    ertodd        -0.79
    achusetts     -0.79
    imates        -0.78
    neapolis      -0.78
    Positive Logits
    zz            0.95
    zza           0.88
    ÄŁ            0.86
    ÅŁ            0.85
    gger          0.83
    ñ             0.82
    ça            0.80
    FORE          0.79
    zzi           0.76
    legates       0.74
    Activation Density: 0.043%

    No Known Activations