Explanations
the frequency of the word "for" in various contexts
Negative Logits
Token      Logit
ála        -0.17
rech       -0.16
Äħż        -0.15
forg       -0.14
privacy    -0.14
Æ°á»Ľ     -0.14
viso       -0.14
ýš         -0.14
ctl        -0.14
oliday     -0.14
Positive Logits
Token      Logit
ãĥ¼ãĥĭ    0.19
ermen      0.15
ante       0.15
شر         0.15
å±¥        0.15
ÃĹ↵↵      0.15
usta       0.14
omer       0.14
portal     0.14
Starr      0.14
Activations Density 0.025%