Explanations
references to the "Old" in various contexts
Negative Logits
Token       Logit
.cx         -0.17
alyze       -0.15
168         -0.15
alm         -0.15
391         -0.15
atrix       -0.14
789         -0.14
ote         -0.14
izer        -0.14
ushi        -0.14
Positive Logits
Token       Logit
enburg      0.29
endor       0.27
řich        0.26
-fashioned  0.26
fashioned   0.24
ENDOR       0.23
/New        0.21
ham         0.20
Testament   0.20
/new        0.20
Activations Density 0.029%