Explanations
I am / I understand / I cannot
Negative Logits
:         0.85
;         0.82
by        0.80
},        0.79
,         0.79
.         0.77
-         0.75
the       0.74
,"        0.70
during    0.69
Positive Logits
نے        0.68
aculate   0.68
Phones    0.66
ਨ੍ਹਾਂ       0.65
げ        0.65
쁜        0.65
垐        0.65
㤈        0.64
utenant   0.63
brahim    0.63
Activations Density: 0.060%