INDEX
Explanations
negations and warnings related to safety and health precautions
negative constraints/prohibitions ("not")
Negative Logits
Token        Value
proved       -0.64
vindicated   -0.55
isPrime      -0.55
aculture     -0.55
didn         -0.54
didn         -0.54
correctly    -0.54
prove        -0.53
wasn         -0.52
correctly    -0.51
Positive Logits
Token            Value
مرئيه            0.72
Rüyada           0.67
ContentAsync     0.67
disambiguazione  0.65
للاسماء          0.61
كومونز           0.59
GenerationType   0.57
üyada            0.57
ecore            0.56
exitRule         0.56
Activations Density: 0.364%
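
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that activation density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used for illustration only.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is taken to mean the share of positions where the
    feature fires at all; this definition is not confirmed by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 250 token positions.
sample = [0.0] * 249 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.364% would mean the feature is active on roughly 36 of every 10,000 sampled tokens.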