Explanations
instances of low-ranking attribute terms in research contexts
non- phrases
Negative Logits
minecraftforge    -0.43
تضيفلها    -0.38
Macy              -0.38
geho              -0.38
BoxLayout         -0.37
دیکھیے    -0.35
likler            -0.35
Deposit           -0.34
you               -0.33
Bezir             -0.33
Positive Logits
‐      1.70
‐      1.33
‑      0.96
﴾      0.96
־      0.83
־ה     0.71
$--    0.69
[''    0.68
﴿      0.67
﹣      0.65
Activations Density 0.026%