INDEX
Explanations
Discourse Processing, Analysis, Representation, marker
NEGATIVE LOGITS
س  0.88
де  0.87
ิ  0.85
า  0.75
ো  0.72
ა  0.72
و  0.70
ح  0.70
are  0.68
ो  0.68
POSITIVE LOGITS
\  0.73
enraged  0.71
relacion  0.66
entidades  0.66
gobern  0.66
Partido  0.66
Hochzeit  0.64
ڨ  0.64
0.64
Ṕ  0.63
Activations Density 0.001%