INDEX
Explanations
occurrences of words related to actions or verbs in various contexts
Negative Logits
Token    Logit
uch      -0.17
enda     -0.15
éĿ       -0.15
abal     -0.15
èħIJ     -0.14
izard    -0.14
umph     -0.14
viar     -0.14
unal     -0.14
belt     -0.13
Positive Logits
Token        Logit
สว           0.15
826          0.15
catalogue    0.14
sp           0.14
Fisher       0.14
Vid          0.14
Swim         0.14
ex           0.14
cre          0.13
USA          0.13
Activations Density: 0.009%