Explanations
Discussions of socio-political and cultural criticism, particularly concerning privilege and systemic issues
Negative Logits
ior         -0.17
oot         -0.15
ンス        -0.15
.jetbrains  -0.15
est         -0.14
WA          -0.14
emm         -0.14
WA          -0.13
Fol         -0.13
cho         -0.13
Positive Logits
xCB         0.17
OMIC        0.16
ymes        0.15
Hüs         0.15
defer       0.15
deniz       0.15
оды         0.15
ddy         0.15
elijk       0.14
//{{        0.14
Activations Density 0.006%