Explanations
discussions about procedural changes and their impacts
Negative Logits
ensi    -0.16
徒歩    -0.15
agi     -0.15
att     -0.15
286     -0.14
GR      -0.14
sex     -0.13
unc     -0.13
.nd     -0.13
adar    -0.13
Positive Logits
ucch        0.17
ugen        0.16
внимание    0.14
атку        0.14
Portions    0.14
IVAL        0.14
ohl         0.14
pNet        0.14
يتم         0.14
antha       0.14
Activations Density: 0.320%