INDEX
Explanations
phrases indicating location or place
Negative Logits
rok        -0.15
aza        -0.15
Mor        -0.15
frequent   -0.15
radan      -0.14
ims        -0.14
egin       -0.13
stance     -0.13
hz         -0.13
.wik       -0.13
Positive Logits
ardon          0.16
@admin         0.15
جاد            0.14
acio           0.14
deo            0.14
ìĨĶ            0.14
authenticated  0.14
.lesson        0.13
yle            0.13
quires         0.13
Activations Density: 0.352%