INDEX
Explanations
references to responsibilities and accusations of murder or harm towards specific groups or individuals
sentence starters following "the" or "and"
Negative Logits
Token              Logit
CreateTagHelper    -0.57
betweenstory       -0.52
ValueStyle         -0.52
مشين               -0.51
点此举报           -0.48
featureID          -0.48
ostavi             -0.47
thâu               -0.47
հղումներ           -0.46
PeEnEo             -0.45
Positive Logits
Token              Logit
list               2.47
List               1.88
list               1.88
lista              1.86
lists              1.80
LIST               1.70
Liste              1.67
liste              1.66
List               1.64
список             1.58
Activations Density 0.036%
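
The density figure above can be read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard, which may compute the value differently.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which are active.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 0.036% would correspond to roughly 36 active positions per 100,000 tokens.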