INDEX
Explanations
discussions about rights, governance, and social justice issues
Negative Logits
ourg    -0.17
urat    -0.15
Ģìŀ¥    -0.15
riad    -0.14
iques   -0.14
ãĤµãĤ¤  -0.14
averse  -0.14
ibold   -0.14
adesh   -0.14
assi    -0.14
Positive Logits
need    0.41
need    0.36
Need    0.33
Need    0.31
reck    0.28
_need   0.24
care    0.24
aim     0.24
NEED    0.23
unjust  0.23
Activations Density 0.592%