Explanations
instances of the <bos> and <end> tokens within the text
Negative Logits
ConstraintMaker    -0.84
otomatig           -0.82
HostException      -0.75
Geplaatst          -0.73
Paglinawan         -0.73
ویکیپدیا           -0.73
autorytatywna      -0.72
msgTypes           -0.70
Vidite             -0.70
发表于             -0.67
Positive Logits
roma               0.54
poffible           0.45
duradero           0.43
digen              0.41
negó               0.40
Romani             0.40
pleaſure           0.39
entuh              0.39
stacles            0.38
romana             0.38
Activations Density: 0.078%