Explanations
phrases indicating guidance or instruction
requests and directives for action or compliance
Negative Logits
Token         Logit
)—            -0.84
"—            -0.81
Rodham        -0.77
ËĪ            -0.72
SAN           -0.71
\.            -0.71
—"            -0.69
slam          -0.66
AFP           -0.66
ickle         -0.65
Positive Logits
Token         Logit
Additionally  0.92
furthermore   0.88
Example       0.86
Additionally  0.85
additionally  0.85
Example       0.85
drawback      0.84
Examples      0.83
cknowled      0.81
exceptions    0.81
Activations Density 0.455%