INDEX
Explanations
comparisons or contrasts
Negative Logits
renheit    -0.75
nell       -0.70
ilit       -0.69
xus        -0.69
biz        -0.69
vous       -0.69
plete      -0.68
vich       -0.67
scrib      -0.66
uably      -0.66
Positive Logits
sexes      1.68
sides      1.44
halves     1.40
genders    1.37
parties    0.96
extremes   0.92
coasts     0.89
Houses     0.88
ends       0.86
thirds     0.82
Activations Density 2.436%