Explanations
repeated phrases, particularly articles and demonstratives, indicating focus on specific references in the text
Followed by adjectives
determiners followed by nouns
Negative Logits
the               -0.45
                  -0.44
parcialmente      -0.35
Widerspruch       -0.35
a                 -0.34
Landwirtschaft    -0.33
The               -0.33
begin             -0.32
with              -0.32
Stufe             -0.32
Positive Logits
<unused43>        1.09
<unused41>        1.09
<unused16>        1.09
<unused28>        1.09
<unused3>         1.09
[@BOS@]           1.09
<unused8>         1.09
<unused14>        1.09
<unused51>        1.09
<unused52>        1.09
Activations Density 1.352%
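
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that "activation density" means the fraction of token positions on which the feature fires (activation above a threshold). That definition, the function name activation_density, the threshold default, and the sample values are all assumptions made for illustration; the page does not state how its figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard does not
    say how its density value is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 8 token positions.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 12.500%
```

Under this reading, a density of 1.352% would mean the feature is active on roughly 1 in 74 tokens of the sampled text.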