INDEX
Explanations
words containing the substring "h" followed by a verb
repetitions of the letter "h"
Negative Logits
ãĤ´ãĥ³        -0.85
ãĥ¯           -0.78
bilt          -0.69
éĹĺ           -0.69
fman          -0.65
ãĥ¼ãĥĨãĤ£     -0.65
ãĥ´ãĤ¡        -0.64
Beware        -0.64
Clause        -0.63
EStream       -0.63
Positive Logits
oused         1.38
ousing        1.24
ulk           1.21
acking        1.20
anging        1.19
idd           1.18
ospital       1.17
olly          1.16
awk           1.15
anky          1.15
Activations Density: 0.022%