INDEX
Explanations
characters or symbols used in written languages
Negative Logits
+                -0.60
────────         -0.59
(blank)          -0.57
(blank)          -0.57
=                -0.56
.                -0.56
(blank)          -0.55
(blank)          -0.54
(blank)          -0.54
addContainerGap  -0.54
Positive Logits
्      0.58
ь      0.57
્      0.56
্      0.55
े      0.54
่      0.54
ं      0.53
้      0.52
setIs  0.51
ि      0.51
Activations Density 0.501%