Explanations
instances where there is a comparison or contrast between two entities
discussions of relationships or comparisons between two entities or groups
Negative Logits
vich        -0.87
enhagen     -0.82
xus         -0.79
ucked       -0.79
anooga      -0.78
ertodd      -0.73
maxwell     -0.73
nown        -0.72
itness      -0.72
renheit     -0.71
Positive Logits
halves      1.39
sexes       1.34
extremes    1.26
sides       1.19
genders     1.15
worlds      1.10
realms      1.10
parties     1.06
eras        1.03
cultures    0.98
Activations Density: 0.126%