Explanations
references to seas and oceans
Negative Logits
ãĥ³ãĥĶ    -0.17
erable    -0.15
äºİ       -0.15
asti      -0.14
éal       -0.14
gebung    -0.14
ssc       -0.14
мп        -0.14
duk       -0.14
ultural   -0.14
Positive Logits
avor      0.16
yn        0.16
lant      0.15
ug        0.15
ante      0.14
obs       0.14
fur       0.14
SHARE     0.13
YN        0.13
VS        0.13
Activations Density 0.006%
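
As a rough illustration of what a density figure like this usually means, the sketch below assumes that activation density is the fraction of token positions on which the feature fires (activation above zero). That definition, the function name `activation_density`, and the sample data are assumptions for illustration and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: a feature "fires" on a token when its activation
    is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 6 firing tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.006%
```

Under this reading, a density of 0.006% corresponds to the feature firing on about 6 of every 100,000 tokens, which marks it as a very sparse feature.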