INDEX
Explanations
terms related to ongoing processes or actions
Negative Logits
revis       -0.19
ीकरण        -0.18
носи        -0.17
stro        -0.16
бит         -0.16
travers     -0.15
pras        -0.15
ishment     -0.15
UpInside    -0.15
ipation     -0.14
Positive Logits
ify         0.38
ize         0.33
ulate       0.32
itize       0.32
enate       0.32
ise         0.31
erate       0.31
inate       0.30
uate        0.30
inize       0.30
Activations Density 0.300%