INDEX
Explanations
instances of the prefix "Wh" often found in proper nouns or interrogative contexts
Negative Logits
y         -0.18
yw        -0.17
æŀIJ       -0.17
yh        -0.17
abled     -0.16
phase     -0.15
ãĥ¼ãĤ¸    -0.15
ei        -0.15
abel      -0.15
ez        -0.14
Positive Logits
Wh        0.21
arton     0.21
olesale   0.19
foods     0.19
oles      0.18
곡        0.18
isk       0.18
Wh        0.17
ichever   0.17
ims       0.17
Activations Density 0.012%