INDEX
Explanations
phrases that refer to fractions or divisions of larger units
Negative Logits
ìĬµ      -0.16
Ñĥже     -0.15
yo       -0.15
ewise    -0.14
ching    -0.14
/fw      -0.14
zer      -0.14
gen      -0.14
umer     -0.14
elu      -0.14
Positive Logits
afia     0.16
ÙĨ       0.16
lies     0.15
ãĥ¼ãĥĪ   0.15
ystal    0.15
ystone   0.15
äch      0.15
bread    0.15
ìĦľëĬĶ   0.14
indrome  0.14
Activations Density 0.016%