INDEX
    Explanations

    phrases related to power dynamics and societal issues

    Negative Logits
    Token               Logit
    "\tTokenName"       -0.17
    ".cljs"             -0.13
    "-таки"             -0.13
    "منی"               -0.13
    ".scalablytyped"    -0.12
    "/Dk"               -0.12
    "amerate"           -0.12
    "_DECREF"           -0.12
    "'].'/"             -0.12
    "inspace"           -0.12
    Positive Logits
    Token               Logit
    " if"               1.12
    " If"               0.81
    " nếu"              0.79
    " если"             0.78
    " jika"             0.75
    "If"                0.74
    "\tif"              0.73
    "如果"              0.72
    "if"                0.72
    " wenn"             0.69
    Activation Density 1.706%

    No Known Activations