Explanations
significant verbs and nouns related to actions and interactions
Negative Logits
Token     Logit
aka       -0.17
imator    -0.15
athan     -0.15
agan      -0.15
mapper    -0.14
èĬĤ       -0.14
stad      -0.14
arella    -0.14
attr      -0.13
ÑĨÑı      -0.13
Positive Logits
Token       Logit
unately     0.17
ically      0.16
ively       0.16
ishly       0.16
oger        0.16
edly        0.15
istically   0.15
urch        0.15
ally        0.15
-of         0.15
Activations Density 0.005%