Explanations
references to organizations and partnerships involved in social justice and philanthropic efforts
Negative Logits
+#+#            -0.68
gradable        -0.60
biografias      -0.52
inac            -0.46
AxisAlignment   -0.45
Initially       -0.44
+:+             -0.44
zulegen         -0.44
かも            -0.44
lando           -0.44
Positive Logits
committed       1.20
working         1.14
committed       1.08
worked          1.06
arbej           1.05
working         1.05
works           1.03
Committed       1.01
commit          1.01
strive          1.01
Activations Density: 0.240%