INDEX
Explanations
statements expressing beliefs or opinions about integrity and assessments
Negative Logits
ophora    -0.48
rains     -0.47
Xuân      -0.47
sabar     -0.47
vow       -0.46
⚭         -0.46
schöne    -0.46
zerw      -0.46
umumkan   -0.46
doulou    -0.45
Positive Logits
featureID        0.76
كومونز           0.72
endphp           0.69
Personendaten    0.68
Geplaatst        0.67
BufferException  0.65
usercontent      0.63
NSCoder          0.61
batore           0.60
estekak          0.60
Activations Density: 0.412%