INDEX
Explanations
variable declarations in code
Negative Logits
"):     -0.99
)");    -0.96
").     -0.95
).</    -0.92
()).    -0.91
)";     -0.87
%).     -0.87
]").    -0.86
?).     -0.86
”).     -0.86
Positive Logits
v       1.43
V       1.39
v       1.36
V       1.34
getV    1.32
Vv      1.07
vv      1.01
vv      1.00
Bv      0.95
zv      0.93
Activations Density 0.195%