INDEX
Explanations
references to the former Egyptian president Hosni Mubarak
Negative Logits
*/(        -0.91
ienced     -0.89
ansas      -0.80
gency      -0.75
tions      -0.75
iences     -0.73
imental    -0.73
manager    -0.73
tons       -0.71
drawn      -0.71
Positive Logits
emi        1.01
elsen      0.99
yah        0.88
Äį         0.86
qi         0.86
ya         0.85
Äĩ         0.85
q          0.82
agra       0.81
ptions     0.79
Activations Density 0.005%
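
As a rough illustration of how to read the density figure, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 5 of 100,000 tokens.
sample = [0.0] * 99_995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.005%
```

Under this reading, a density of 0.005% corresponds to roughly one firing per 20,000 tokens, which would fit a feature tied to a narrow topic.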