Explanations
references to human rights reports and statements
human rights violations
Negative Logits
PreferredItem
-0.58
Hentet
-0.55
تضيفلها
-0.52
WriteBarrier
-0.52
InputBorder
-0.50
مشين
-0.50
createSlice
-0.50
gameserver
-0.48
صوتيه
-0.48
+:+
-0.47
Positive Logits
0.48
human
0.36
คลิ
0.35
HUMAN
0.35
Kết
0.35
oa̍t
0.35
ayat
0.34
adpleegd
0.34
Bisous
0.34
忘
0.33
Activations Density 0.081%