INDEX
Explanations
phrases indicating things that function effectively or exceed expectations
Negative Logits
Outs       -0.18
outs       -0.18
.comp      -0.17
ubic       -0.16
оÑı        -0.16
outline    -0.15
outs       -0.15
elsey      -0.15
NullOr     -0.15
quip       -0.15
Positive Logits
gate       0.35
Gate       0.29
gates      0.28
gate       0.28
Gate       0.27
_gate      0.25
Gates      0.25
blocks     0.20
éĸ         0.18
éĸĢ        0.18
Activations Density 0.014%