INDEX
Explanations
social, emotional, political, legal, sexual, racial categories
economic, legal, social, emotional, financial, sexual aspects
Negative Logits
ي	0.55
يا	0.54
وم	0.53
م	0.52
एस	0.51
ير	0.50
يل	0.50
ফ	0.48
ك	0.47
し	0.47
Positive Logits
;	0.54
al	0.46
)	0.42
o	0.41
ATING	0.40
UGH	0.40
<0x80>	0.39
a	0.38
akn	0.38
Bupati	0.38
Activations Density: 3.065%