INDEX
Explanations: placeholders
Negative Logits
Token               Value
(no visible token)  0.62
,                   0.52
-                   0.51
(no visible token)  0.50
:                   0.48
in                  0.48
for                 0.47
(                   0.46
and                 0.46
.                   0.45
Positive Logits
Token               Value
<unused2020>        0.46
<unused50>          0.42
<unused1103>        0.41
اونلوډ              0.40
<unused95>          0.40
<unused25>          0.40
<unused73>          0.40
<unused22>          0.39
<unused66>          0.39
followlike          0.38
Activations Density 0.216%