INDEX
Explanations
descriptions of people and their actions
concepts related to social justice and inequality
Negative Logits
]).        -0.63
SPONSORED  -0.59
>]         -0.57
Gene       -0.57
Auth       -0.57
]),        -0.56
arthed     -0.55
})         -0.54
Ħ¢         -0.54
Last       -0.53
Positive Logits
however     0.54
flatt       0.54
though      0.53
everywhere  0.50
meanwhile   0.49
ometimes    0.47
insofar     0.47
ital        0.47
orphans     0.47
subdiv      0.46
Activations Density 2.100%