Explanations
commentary or annotations used for code
Negative Logits
------------------------------------------------    -0.19
----------------------------------------------------------------    -0.18
----------------    -0.16
--------------------------------------------------------------------------------    -0.16
wire    -0.16
================================================================    -0.15
--------------------------------    -0.15
--------------------    -0.15
sg    -0.15
swire    -0.15
Positive Logits
s    0.17
TODO    0.16
undry    0.15
nds    0.15
ÏĤ    0.14
sik    0.14
Ùĩ    0.14
iance    0.14
ůj    0.14
itage    0.14
Activations Density 0.095%