Explanations
References to "other" entities or categories
Negative Logits
lainnya         -0.83
الأخرى          -0.74
AddTagHelper    -0.73
berikutnya      -0.70
autre           -0.69
restantes       -0.66
mourut          -0.66
others          -0.65
JSTOR           -0.65
andern          -0.64
Positive Logits
worldly         1.45
than            1.16
than            0.85
था              0.82
Than            0.81
world           0.78
niż             0.77
THAN            0.76
ness            0.74
kinds           0.73
Activations Density: 0.140%
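
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is obtained: the fraction of sampled tokens on which the feature's activation is above zero. This is an assumption about the metric, not something the page states. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the dashboard itself does not define it.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 14 of which activate the feature.
sample = [0.0] * 9986 + [1.2] * 14
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.140%"
```

Under this reading, a density of 0.140% would mean the feature fires on roughly 14 of every 10,000 sampled tokens.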