INDEX
Explanations
numerical data related to measurements and specifications
Negative Logits
à¹Īà¸ĩ    -0.15
rop       -0.15
ooks      -0.15
entine    -0.15
ledo      -0.15
odds      -0.15
OLON      -0.14
ells      -0.14
orem      -0.13
cigars    -0.13
Positive Logits
.tw       0.23
-tw       0.18
.f        0.17
zer       0.17
zero      0.17
zero      0.16
tw        0.16
.two      0.16
,Th       0.15
-zero     0.15
Activations Density 0.028%