INDEX
Explanations
mentions of medical conditions and related terminology
Negative Logits
כ  -1.55
הכ  -0.62
כ  -0.58
וכ  -0.51
בכ  -0.48
minecraftforge  -0.41
mtrl  -0.41
כח  -0.40
πο  -0.40
unc  -0.39
Positive Logits
featureID  0.54
rungsseite  0.52
resourceCulture  0.52
tvguidetime  0.52
numerusform  0.52
Италијани  0.50
########.  0.50
فريبيس  0.48
Jefus  0.47
againſt  0.45
Activations Density 0.000%