INDEX
Explanations
references to awards or recognitions in films
Negative Logits
تضيفلها    -0.90
الحياه    -0.83
beginnetje    -0.82
AxisAlignment    -0.81
StringCopy    -0.81
<bos>    -0.80
ब्रेकडाउन    -0.80
AddTagHelper    -0.79
יצוני    -0.78
Portail    -0.78
Positive Logits
↵↵    0.64
(no visible token)    0.47
joint    0.44
,    0.43
.    0.41
(    0.41
(no visible token)    0.41
-    0.41
的同时    0.40
=    0.39
Activations Density 0.865%
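
As a rough illustration of what a density figure like the one above measures, here is a minimal sketch. It assumes density is the fraction of sampled tokens on which the feature's activation is greater than zero. The function name and the sample data are hypothetical, and the dashboard's own computation may differ (for example in the threshold or in which tokens are sampled).

```python
from typing import Sequence


def activation_density(activations: Sequence[float]) -> float:
    """Return the fraction of tokens with a strictly positive activation.

    Assumption: "active" means activation > 0; a real dashboard may use
    a different threshold.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical data: 1,000 tokens, 9 of which activate the feature.
sample = [1.5] * 9 + [0.0] * 991
print(f"{activation_density(sample):.3%}")  # prints "0.900%"
```

Under this definition, a density of 0.865% would mean the feature fires on roughly 1 token in 116.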