Negative Logits
greateſt  -0.66
myſelf  -0.66
pleaſure  -0.64
Shakspeare  -0.63
neceff  -0.63
lgari  -0.63
purpoſe  -0.61
Monfieur  -0.60
whoſe  -0.60
AndEndTag  -0.60
Positive Logits
,  0.56
gy  0.50
saisir  0.49
—  0.49
=  0.48
der  0.47
,「  0.47
()=>{  0.47
CHANT  0.45
BuildContext  0.44
Activations Density 7.847%