Explanations
occurrences of the word "two" and its variations in various contexts
Negative Logits
../../            -0.21
../               -0.17
th                -0.17
stuff             -0.15
rd                -0.15
st                -0.15
era               -0.15
.openConnection   -0.15
raj               -0.15
../../../         -0.15
Positive Logits
gether            0.25
/th               0.21
-faced            0.20
nd                0.20
handed            0.19
halves            0.19
sides             0.19
-thirds           0.19
fer               0.18
Kore              0.18
Activations Density 0.130%