INDEX
Explanations
references to production or something being produced
Negative Logits
XF      -0.16
bate    -0.16
çĴĥ     -0.14
İÅŀ     -0.14
arten   -0.14
517     -0.14
orte    -0.14
ulumi   -0.14
mÃŃt    -0.14
lage    -0.14
Positive Logits
gett    0.25
posit   0.24
iet     0.21
ble     0.21
getti   0.19
iez     0.19
iel     0.19
ÑĶ      0.19
iek     0.19
getto   0.18
Activations Density 0.008%