INDEX
    Explanations

    words related to legal and social justice issues

    Negative Logits
    lev       -0.17
    heiro     -0.16
    urn       -0.16
    elin      -0.15
    ROUGH     -0.14
    _CT       -0.14
    ITO       -0.14
    marvin    -0.14
    _GPU      -0.14
    íše       -0.13
    Positive Logits
    etter     0.17
    egend     0.15
    ệ         0.14
    дон       0.14
    endi      0.14
    imar      0.14
    onder     0.14
    afka      0.14
    754       0.14
    kop       0.14
    Activation Density: 0.002%

    No Known Activations