INDEX
Explanations
phrases expressing the importance of specific situations or actions
explanations and justifications in a text
Negative Logits
apsed      -0.72
inis       -0.68
lez        -0.67
fml        -0.65
scr        -0.65
inse       -0.64
floor      -0.61
ascus      -0.60
inate      -0.59
tyr        -0.59
Positive Logits
*/(        0.73
Firstly    0.72
ecause     0.71
][         0.71
unlike     0.69
suppose    0.68
although   0.68
evidenced  0.65
"[         0.64
neither    0.63
Activations Density 0.241%