INDEX
Explanations
phrases that indicate accuracy and responsibility in statements
Negative Logits
betweenstory        -0.92
yntaxException      -0.79
YourGuide           -0.77
AssemblyTitle       -0.74
devamını            -0.73
RegressionTest      -0.72
bbero               -0.71
الحره               -0.70
numerusform         -0.70
migrationBuilder    -0.70
Positive Logits
enough              0.98
enough              0.68
UnsafeEnabled       0.65
and                 0.59
isSuccessful        0.56
מאוד                0.56
átní                0.54
bright              0.51
low                 0.49
Smal                0.49
Activations Density 0.915%