INDEX
Explanations
snippets from different kinds of documents
part, aspect, beginning
Negative Logits
Token              Logit
<unused61>         -1.08
<unused62>         -1.06
<unused63>         -1.03
↵                  -1.00
1                  -0.98
<eos>              -0.95
(no token shown)   -0.94
<unused60>         -0.94
...                -0.93
↵↵                 -0.91
Positive Logits
Token              Logit
Theſe              2.00
Monfieur           1.90
Efq                1.88
myſelf             1.84
itſelf             1.71
Мексичка           1.59
purpoſe            1.58
Anſ                1.56
ſeveral            1.55
Jefus              1.54
Activations Density: 12.738%