Explanations
relationships between entities and their attributes or qualities
conjunctions in lists
Negative Logits
ьаж    -0.61
Paglinawan    -0.60
للمعارف    -0.60
featureID    -0.59
hyrchwyd    -0.57
нгред    -0.57
rrggbb    -0.56
AnchorTagHelper    -0.56
kháu    -0.56
surla    -0.54
Positive Logits
me    0.32
rste    0.30
particular    0.30
particularly    0.29
great    0.28
roh    0.28
حض    0.28
ticularly    0.27
istik    0.27
adan    0.27
Activations Density 0.428%