INDEX
Explanations
requests for explanation or clarification in discourse
Negative Logits
"]);            -0.74
DateFormat      -0.66
propOrder       -0.65
aarrggbb        -0.62
Où              -0.61
şört            -0.61
lengthen        -0.60
displayquote    -0.60
pylint          -0.59
قایناقلار       -0.59
Positive Logits
people      0.86
People      0.84
People      0.77
PEOPLE      0.76
ppl         0.70
people      0.69
ludzi       0.68
pessoas     0.66
somebody    0.66
Somebody    0.65
Activations Density 0.314%