INDEX
Explanations
phrases indicating presence or involvement in activities
Negative Logits
ampo    -0.07
inka    -0.07
759     -0.07
ëĤĺ     -0.07
gio     -0.07
edin    -0.07
zion    -0.06
a       -0.06
deo     -0.06
вед     -0.06
Positive Logits
nis        0.08
fty        0.08
/out       0.07
organic    0.07
shore      0.07
toItem     0.07
průbÄĽhu   0.06
šov        0.06
úsqueda    0.06
whole      0.06
Activations Density: 0.069%