INDEX
Explanations
markers indicating the start of new sections or important content within texts
Negative Logits
es            -0.71
io            -0.69
Paz           -0.65
Paz           -0.64
<blockquote>  -0.63
de            -0.63
paz           -0.63
ly            -0.62
lat           -0.62
deals         -0.61
Positive Logits
}^{*}$        1.39
$=$           1.36
$)$           1.34
}}$,          1.33
$]$           1.28
$>$           1.28
$\$$          1.27
}}$           1.27
)}$           1.25
})}$          1.23
Activations Density 0.223%