INDEX
Explanations
words related to visual differences or comparisons
phrases that describe differences or comparisons between elements
Negative Logits
Mos           -0.73
authorized    -0.69
aye           -0.67
reditary      -0.66
ackers        -0.66
rake          -0.65
Bay           -0.64
aptic         -0.62
bey           -0.61
kn            -0.61
Positive Logits
juxtap        0.91
contrasts     0.81
contrasting   0.79
xual          0.77
stark         0.77
ibilities     0.74
ĸļ            0.72
Contrast      0.72
lihood        0.72
sexes         0.71
Activations Density 0.012%