INDEX
Explanations
temporal conjunctions followed by conditional phrases
Negative Logits
a         0.66
is        0.61
the       0.58
was       0.57
2         0.55
an        0.53
hearted   0.49
inspired  0.48
:         0.47
springs   0.47
Positive Logits
Сан       0.60
銠        0.58
يد        0.56
électron  0.56
ιν        0.55
ين        0.54
Сейчас    0.54
ري        0.53
حي        0.53
Види      0.53
Activations Density: 0.000%