Explanations
phrases related to providing explanations or justifications
punctuation, specifically commas
Negative Logits
ãģ®            -0.68
Ö¼             -0.66
yna            -0.65
rill           -0.64
Pont           -0.62
ISE            -0.62
èĢħ            -0.61
EO             -0.60
igan           -0.60
ãģ®å           -0.60
Positive Logits
nevertheless   1.16
nonetheless    0.89
there          0.80
there          0.75
namely         0.73
retaining      0.71
it             0.71
alas           0.70
albeit         0.69
suffice        0.68
Activations Density: 0.140%
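
Several entries in the negative-logit list (for example ãģ®, Ö¼ and èĢħ) look like raw byte-level BPE token strings rather than readable text. The sketch below assumes, without confirmation from this page, that the model uses a GPT-2-style byte-level tokenizer; under that assumption it rebuilds the standard byte-to-character table and inverts it to recover the underlying text. The helper name decode_token is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table.

    Assumption: the tokens on this page come from a byte-level BPE
    vocabulary that uses this mapping (as GPT-2 does).
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a displayed token string back into text.

    Tokens that end in the middle of a multi-byte UTF-8 sequence are
    decoded with a replacement character instead of raising an error.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãģ®", "Ö¼", "èĢħ", "ãģ®å", "nevertheless"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, ãģ® decodes to the Japanese particle の and èĢħ to the character 者, while ãģ®å is の followed by the first byte of an incomplete character. Plain ASCII tokens such as nevertheless pass through unchanged.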