INDEX
Explanations
instances of the word "Both" followed by a comparison or contrast between two entities or concepts
Negative Logits
xus        -0.77
ugu        -0.77
nect       -0.75
renheit    -0.75
uably      -0.74
ilit       -0.71
uable      -0.70
nowhere    -0.68
biz        -0.68
vich       -0.68
Positive Logits
sexes        1.68
sides        1.57
halves       1.37
genders      1.36
parties      1.09
coasts       0.92
extremes     0.91
Houses       0.90
ends         0.86
directions   0.81
Activations Density 0.360%