INDEX
Explanations
phrases discussing the number "two"
mentions of the number two
NEGATIVE LOGITS
¬¼                  -0.74
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.68
erker               -0.67
eez                 -0.67
Ł                   -0.65
enture              -0.64
taboola             -0.64
humane              -0.62
phi                 -0.62
pmwiki              -0.61
POSITIVE LOGITS
namely              1.25
totaling            1.06
viz                 1.05
apiece              1.05
each                0.94
respectively        0.92
two                 0.87
consecut            0.85
halves              0.74
ones                0.73
Activations Density: 0.354%