Explanations
mentions of the word "Gram" and its related forms, indicating a focus on measurements and metrics
Negative Logits
657    -0.17
↵↵    -0.17
uien    -0.16
gue    -0.16
utters    -0.15
¼åIJĪ    -0.15
tuÄŁ    -0.15
eft    -0.15
ÑĤо    -0.14
bane    -0.14
Positive Logits
ophone    0.35
à¥Ģण    0.28
mys    0.28
erc    0.27
atical    0.27
bling    0.25
mer    0.25
matic    0.25
sci    0.24
atically    0.23
Activations Density 0.005%
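
Several tokens in the logit lists above (for example tuÄŁ, ÑĤо, à¥Ģण) look like byte-level BPE tokens shown in their raw vocabulary form rather than as readable text. The sketch below assumes the model's tokenizer uses the GPT-2 style byte-to-unicode table; that is an assumption, since the page does not name the tokenizer. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte value to a printable character.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Turn a displayed vocabulary token back into text where possible."""
    raw = bytearray()
    for ch in token:
        if ch in BYTE_DECODER:
            raw.append(BYTE_DECODER[ch])
        else:
            # Characters outside the table are treated as already decoded text.
            raw.extend(ch.encode("utf-8"))
    # Tokens that start or end mid-character yield U+FFFD replacement marks.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["tuÄŁ", "ÑĤо", "ophone"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, tuÄŁ would read as tuğ and ÑĤо as то, while plain ASCII tokens such as ophone pass through unchanged. A token that begins in the middle of a multi-byte character cannot be fully recovered and decodes with a replacement mark.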