Explanation: abbreviations and initialisms
Negative Logits
Token     Value
一定的    0.46
يا        0.45
ت         0.44
其他      0.42
يع        0.42
дел       0.41
غير       0.40
Ngoài     0.40
樾        0.39
也是      0.38
Positive Logits
Token        Value
for          0.59
ur           0.49
lesions      0.49
hewan        0.45
troops       0.45
beverages    0.44
인한         0.44
pháp         0.44
নিষ্পত্তি      0.44
O            0.43
Activations Density: 0.078%