Explanations
HTTP parameters and code snippets
Negative Logits
Token       Value
-           0.94
......      0.87
-(          0.86
**          0.86
.......     0.83
....        0.82
++          0.82
........    0.82
-.          0.81
–           0.81
Positive Logits
Token            Value
<unused0>        0.74
முற              0.64
instead          0.64
Instead          0.60
+                0.60
Fach             0.59
does             0.58
ဟု               0.58
Charm            0.58
<unused1056>     0.58
Activations Density 0.000%