INDEX
Explanations
sections of text that indicate feedback or summaries
Negative Logits
Token    Logit
(        -0.60
,        -0.59
[blank]  -0.58
(        -0.56
so       -0.53
for      -0.53
-        -0.53
ing      -0.53
and      -0.51
as       -0.51
Positive Logits
Token          Logit
findpost       1.15
المعيارى       1.04
")));          1.03
)");           1.01
autorytatywna  1.00
]");           0.99
')));          0.99
tvguidetime    0.98
']))           0.98
'));           0.96
Activations Density: 0.158%