INDEX
Explanations
phrases related to comparing differences between various concepts or entities
phrases that convey comparisons or distinctions
Negative Logits
bye          -0.80
OGR          -0.79
avascript    -0.74
dn           -0.74
nell         -0.73
ihilation    -0.72
der          -0.72
ãĤ¡          -0.71
igion        -0.70
idth         -0.70
Positive Logits
sexes        0.94
genders      0.91
halves       0.79
those        0.74
these        0.70
eras         0.66
what         0.66
them         0.65
Ware         0.61
sender       0.60
Activations Density 0.042%