INDEX
Explanations
the verb "do" and its various forms
Negative Logits
457         -0.15
antly       -0.15
azer        -0.15
kol         -0.15
rox         -0.14
foon        -0.14
stras       -0.14
trak        -0.14
anse        -0.14
OfSize      -0.14
Positive Logits
estre       0.18
xygen       0.15
bserv       0.15
justice     0.14
Comparator  0.14
ayn         0.14
cket        0.13
-done       0.13
inand       0.13
inth        0.13
Activations Density 0.100%