Explanations
elements that signal the start of a new section or topic in written text
Negative Logits
Token       Logit
houſe       -0.97
Diſ         -0.97
pleaſure    -0.96
Monfieur    -0.94
ſta         -0.94
himſelf     -0.92
myſelf      -0.91
ſever       -0.90
themſelves  -0.90
Jefus       -0.90
Positive Logits
Token           Logit
оригіналу       0.65
しつつ          0.61
of              0.57
IndentedString  0.54
autorytatywna   0.53
Билгалдахарш    0.51
live            0.50
cél             0.48
toward          0.48
auprès          0.48
Activations Density 0.320%