Explanations
phrases related to social justice issues and their complexities
Negative Logits
既然      -0.15
spender   -0.14
erton     -0.14
ず        -0.14
irebase   -0.14
duk       -0.14
衆        -0.14
ohn       -0.13
esign     -0.13
çünkü     -0.13
Positive Logits
almost     0.35
almost     0.31
literally  0.26
nearly     0.25
barely     0.25
Almost     0.25
even       0.25
Almost     0.24
neredeyse  0.24
virtually  0.24
Activations Density 0.221%