Explanations
modal verbs indicating possibility or future actions
Negative Logits
favor       -0.15
eger        -0.15
ennie       -0.14
ect         -0.14
.sigmoid    -0.14
Fitz        -0.14
foy         -0.14
Hutch       -0.14
æºĸ         -0.13
pes         -0.13
Positive Logits
bare        0.15
è¼Ŀ         0.14
ture        0.14
ç¿°         0.14
hei         0.14
akan        0.14
vise        0.14
cuent       0.14
heim        0.14
iface       0.13
Activations Density 0.000%