INDEX
Explanations
phrases related to power dynamics and societal issues
Negative Logits
TokenName         -0.17
.cljs             -0.13
-таки             -0.13
منی               -0.13
.scalablytyped    -0.12
/Dk               -0.12
amerate           -0.12
_DECREF           -0.12
'].'/             -0.12
inspace           -0.12
Positive Logits
if       1.12
If       0.81
nếu      0.79
если     0.78
jika     0.75
If       0.74
if       0.73
如果     0.72
if       0.72
wenn     0.69
Activations Density 1.706%