Explanations
negations or expressions of doubt in the text
Negative Logits
Token      Logit
s          -0.90
ς          -0.47
ی          -0.45
の         -0.43
Verhält    -0.41
י          -0.41
es         -0.41
ים         -0.41
.          -0.40
whose      -0.40
Positive Logits
Token            Logit
<bos>            0.82
незавершена      0.60
kaarangay        0.58
SourceChecksum   0.56
AccessorTable    0.55
findpost         0.55
препратки        0.54
Superhost        0.54
ſicht            0.52
виправивши       0.51
Activations Density: 0.102%
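
The page does not say how the density figure is computed. One common reading is the fraction of tokens on which the feature fires at all. The sketch below illustrates that reading only: the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    activation. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 1000.
sample = [0.0] * 999 + [3.2]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.102% would mean the feature is active on roughly one token in a thousand.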