INDEX
Explanations
instances where two entities are being compared or contrasted
Negative Logits
ļéĨĴ      -0.80
renheit   -0.79
nect      -0.76
cise      -0.72
ocaust    -0.72
ugu       -0.72
uable     -0.71
Enlarge   -0.71
istle     -0.69
ylon      -0.69
Positive Logits
sides      1.56
sexes      1.46
parties    1.31
genders    1.28
halves     1.20
factions   0.98
versions   0.98
men        0.95
Houses     0.94
Parties    0.92
Activations Density 0.054%