INDEX
Explanations
personal pronouns in German
NEGATIVE LOGITS
Token        Logit
виправивши   -1.11
Anſ          -1.05
myſelf       -1.04
Diſ          -0.98
Reſ          -0.96
itſelf       -0.96
anſ          -0.95
himſelf      -0.94
Theſe        -0.94
ſelf         -0.93
POSITIVE LOGITS
Token   Logit
</i>    0.60
“       0.60
[       0.59
</b>    0.58
"       0.56
',      0.52
a       0.50
(       0.50
<bos>   0.49
|       0.49
Activations Density: 2.189%