INDEX
Explanations
phrases related to various societal issues or problems
phrases that discuss various issues impacting society
Negative Logits
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.76
Catalog             -0.74
ername              -0.71
ffen                -0.69
âĹ¼                 -0.68
rites               -0.68
âķIJâķIJ            -0.66
gian                -0.66
ãĤ¼ãĤ¦ãĤ¹        -0.65
CBC                 -0.64
Positive Logits
whether             1.00
legality            0.89
fairness            0.84
homelessness        0.84
affordability       0.80
homosexuality       0.76
protecting          0.74
needing             0.74
sexuality           0.72
equality            0.72
Activations Density 0.087%