INDEX
Explanations
the beginning and ending tags of sections or paragraphs in structured text
Negative Logits
purpoſe       -1.12
themſelves    -1.09
himſelf       -1.08
myſelf        -1.07
houſe         -1.04
itſelf        -1.01
ſeveral       -1.00
Anſ           -1.00
iſt           -0.99
reaſon        -0.99
Positive Logits
And           1.04
và            1.04
και           1.03
и             0.97
and           0.96
そして        0.92
\&            0.91
و             0.87
AND           0.86
och           0.82
Activations Density 0.029%