INDEX
Explanations
words related to social justice and community support initiatives
Negative Logits
Token      Logit
sworth     -0.18
(missing)  -0.18
li         -0.18
ning       -0.17
lo         -0.17
ìĿĦ        -0.16
ry         -0.16
ra         -0.16
liness     -0.16
Ìĥ         -0.16

Positive Logits
Token      Logit
ez         0.16
ύ          0.15
AGE        0.15
-minded    0.15
-looking   0.14
ALLY       0.14
ehler      0.14
y          0.14
iative     0.13
element    0.13
Activations Density 0.151%