INDEX
Explanations
references to the themes of realization and introspection in various contexts
Negative Logits
θι    -0.14
(!_    -0.14
.AnchorStyles    -0.13
eyse    -0.13
)은    -0.13
'],$_    -0.13
"...    -0.13
_,    -0.13
بازبینی    -0.13
eyJ    -0.12
Positive Logits
(    0.49
」（    0.42
』（    0.38
》（    0.38
")(    0.38
)(    0.37
'](    0.36
"(    0.35
')(    0.34
)(    0.34
Activations Density 0.303%