Explanations
markers indicating the beginning of new sections or paragraphs
Negative Logits
'\\;'	-0.83
ItemBackground	-0.81
Monfieur	-0.76
moiselle	-0.74
Jefus	-0.73
Beſ	-0.71
PhysRevD	-0.71
otomatig	-0.71
Theſe	-0.71
ſch	-0.70
Positive Logits
تضيفلها	0.60
]")]	0.59
↵	0.58
<eos>	0.57
mstyle	0.53
Disqus	0.50
"	0.48
(	0.48
0.47
שוליים	0.46
Activations Density 0.010%