Explanations
phrases discussing social justice and advocacy for marginalized communities
Negative Logits
Token            Logit
igators          -0.17
ãģĨãģ¡           -0.15
çİ               -0.14
exels            -0.14
anim             -0.14
دار              -0.13
pheric           -0.13
Sexo             -0.13
.djangoproject   -0.13
dued             -0.13
Positive Logits
Token            Logit
instead          0.23
we               0.21
rather           0.21
Instead          0.21
any              0.20
society          0.19
We               0.19
let              0.18
Mr               0.18
far              0.18
Activations Density 0.232%
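
The density figure above is presumably the fraction of sampled tokens on which this feature activates at all. The following is a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample values are hypothetical and do not come from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard does not state its definition.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.232% would mean the feature fires on roughly 2 to 3 tokens per thousand, which fits a sparse, topic-specific feature.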