INDEX
Explanations
references to duality or pairs in various contexts
Negative Logits
onda       -0.16
uring      -0.15
iali       -0.15
akens      -0.14
-League    -0.14
isp        -0.14
pag        -0.13
िड         -0.13
NR         -0.13
erv        -0.13
Positive Logits
sides      0.25
halves     0.24
side       0.23
-side      0.19
directions 0.19
legs       0.18
twin       0.18
方向       0.18
Twin       0.17
-sided     0.17
Activations Density: 0.154%