Explanations
instances of the word "rough" and its variations
Negative Logits
ipar      -0.16
zl        -0.15
osy       -0.14
hta       -0.14
ilated    -0.14
ÙĪÙħÛĮ    -0.14
illary    -0.14
atrix     -0.14
ossal     -0.14
kses      -0.13
Positive Logits
wayne     0.15
uess      0.15
lag       0.15
.pad      0.14
ti        0.14
erie      0.14
imiter    0.14
nego      0.13
å¼¥       0.13
lái       0.13
Activations Density: 0.006%