Explanations
references to page numbers or citations in texts
Negative Logits
народ      -0.15
-Token     -0.14
altet      -0.14
VD         -0.14
@nate      -0.14
XA         -0.14
นก         -0.13
Pey        -0.13
็ด         -0.13
_UNUSED    -0.13
Positive Logits
vi         0.17
3          0.16
Kindle     0.16
unp        0.15
ii         0.15
hana       0.14
34         0.14
ainer      0.14
facing     0.14
76         0.14
Activations Density 0.055%