Explanations
specific actions or processes related to growth, introduction, or assessment
phrasal constructions
Negative Logits
a              -0.28
               -0.27
consequently   -0.26
比べて         -0.25
الع            -0.25
even           -0.25
t              -0.24
q              -0.24
arrep          -0.24
necessarily    -0.23
Positive Logits
uxxxx          1.05
препратки      0.98
للمعارف        0.95
queſta         0.95
ロウィン       0.95
ſind           0.88
témoig         0.88
ostavi         0.87
typelib        0.87
zwiſchen       0.86
Activations Density: 0.109%