Explanations
references to cultural heritage
Negative Logits
Token         Logit
ont           -0.54
thin          -0.53
Harg          -0.52
للاسماء       -0.51
tasting       -0.49
scriptcase    -0.49
пати          -0.49
acceptez      -0.48
taste         -0.48
slutt         -0.48
Positive Logits
Token         Logit
outreach      0.83
Engagement    0.75
AGEMENT       0.75
Engagement    0.73
engagement    0.72
مرئيه         0.72
&=            0.71
AttributeSet  0.68
Outreach      0.67
locu          0.66
Activations Density: 0.068%