INDEX
Explanations
phrases related to social or political issues
Negative Logits
bia           -0.76
hap           -0.70
Ruin          -0.67
ofi           -0.66
Ramp          -0.65
itars         -0.65
Glac          -0.64
overty        -0.63
olulu         -0.62
Parenthood    -0.61
Positive Logits
soType                0.80
ature                 0.80
Hug                   0.68
conclud               0.66
table                 0.65
Table                 0.64
analytical            0.63
rences                0.63
analys                0.63
isSpecialOrderable    0.63
Activations Density 7.623%