INDEX
Explanations
phrases indicating a comparison or relationship between two entities
mentions of pairs or groups of two entities
Negative Logits
renheit   -0.84
ugu       -0.79
ovi       -0.78
annel     -0.72
nect      -0.71
iggins    -0.71
za        -0.70
asta      -0.70
Ô         -0.70
fw        -0.68
Positive Logits
halves          1.46
thirds          1.24
sides           1.12
sexes           1.06
fold            1.04
Kore            0.98
dozen           0.91
extremes        0.89
main            0.84
aforementioned  0.81
Activations Density 0.057%