INDEX
Explanations
keywords related to information or news reports, such as "On arrival," "There," "Just," "The," "What," etc.
instances of introductory phrases or sentence openings
Negative Logits
�               -0.73
.)              -0.67
''              -0.63
cum             -0.61
prompting       -0.60
).              -0.60
Âł Âł Âł Âł     -0.59
listed          -0.59
.)              -0.58
ienne           -0.58
Positive Logits
withstanding    1.10
resa            0.99
ntil            0.99
chieve          0.90
%"              0.89
odore           0.86
ctions          0.84
"[              0.81
ircraft         0.81
xiety           0.80
Activations Density 0.236%