INDEX
Explanations
words ending in 'h' with a high activation value
instances of the symbol 'h'
Negative Logits
ゴン              -0.91
ワ                -0.89
闘                -0.87
ーティ            -0.74
ortium            -0.74
ヴァ              -0.73
DragonMagazine    -0.73
EStream           -0.69
ierrez            -0.67
ーク              -0.67
Positive Logits
oused     1.27
ousing    1.17
awk       1.14
ulk       1.12
acking    1.12
anging    1.09
idd       1.08
annah     1.06
olly      1.05
oney      1.05
Activations Density: 0.018%