INDEX
Explanations
reasons or causes for actions or decisions
causal phrases that indicate reasons or justifications
Negative Logits
bage          -0.74
Anyway        -0.70
cells         -0.69
ail           -0.68
Anyway        -0.67
mber          -0.66
igr           -0.66
inis          -0.65
Plug          -0.65
Dro           -0.64
Positive Logits
resemblance    0.73
precedent      0.73
anecd          0.71
historically   0.69
unlike         0.67
"[             0.67
concerns       0.66
initially      0.65
resemb         0.64
it             0.64
Activations Density: 0.255%