INDEX
Explanations
phrases indicating action or movement
Negative Logits
ntag     -0.19
rana     -0.17
pok      -0.16
rac      -0.15
ivil     -0.15
esk      -0.14
ippi     -0.14
ahoma    -0.14
opor     -0.14
ndl      -0.14
Positive Logits
solo           0.17
.cloudflare    0.17
@Spring        0.16
ef             0.16
Convention     0.15
erness         0.15
endum          0.15
ìŀij            0.15
они            0.15
alone          0.14
Activations Density: 0.256%