INDEX
    Explanations

    concepts related to community and social critique

    Negative Logits
    etc
    -0.28
     etc
    -0.24
    等
    -0.22
     등
    -0.17
    (or
    -0.16
     等
    -0.16
    vor
    -0.15
    ritz
    -0.15
    、
    -0.15
    など
    -0.15
    Positive Logits
     lẫn
    0.38
     AND
    0.35
     as
    0.28
     että
    0.26
    AND
    0.25
     nor
    0.20
    _AND
    0.18
    还是
    0.17
    	AND
    0.16
     quanto
    0.16
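
    Token strings in dashboards of this kind are often stored in byte-level BPE form, where each raw byte is shown as a printable stand-in character (for example `çŃī` for 等). The sketch below maps such a string back to readable text. It assumes the GPT-2 style byte-to-character table, which the source does not confirm, and `decode_token` is a hypothetical helper name rather than part of any library.

```python
def bytes_to_unicode():
    """Byte -> stand-in character table (assumed: GPT-2 style byte-level BPE)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    codepoints = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            printable.append(b)
            codepoints.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(printable, codepoints)}


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Return readable text for a byte-level token, or the token unchanged."""
    if any(ch not in CHAR_TO_BYTE for ch in token):
        # Already readable text (e.g. contains a literal space), leave as-is.
        return token
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not a valid UTF-8 byte sequence, so it was not in stand-in form.
        return token


if __name__ == "__main__":
    for tok in ["çŃī", "ãģªãģ©", "è¿ĺæĺ¯", "AND"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under that assumption the stand-in tokens in the lists above read as 等, 등, 、, など ("etc."-style markers) and 还是 ("or"), which is in line with the enumeration and conjunction tokens elsewhere in the lists.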
    Act Density 0.147%

    No Known Activations