Explanations
references to pairs or dualities
"Two" followed by a noun
two followed by multiple
Negative Logits
Token       Logit
ly          -0.65
algunos     -0.64
allemaal    -0.64
fully       -0.63
algunos     -0.61
algunas     -0.61
meerdere    -0.60
variés      -0.59
一个个      -0.59
alguno      -0.58
Positive Logits
Token       Logit
halves      1.26
sides       1.06
+#+#        0.94
opposing    0.91
sets        0.89
extremes    0.88
thirds      0.85
separate    0.84
main        0.83
ends        0.83
Activation density: 0.284%