INDEX
Explanations
phrases related to emphasizing points or stating facts
statements starting with "The fact that"
Negative Logits
Token         Logit
Interested    -0.63
Refer         -0.62
Annotations   -0.62
nas           -0.61
cart          -0.60
begin         -0.59
;;            -0.58
Caption       -0.58
Orig          -0.54
+.            -0.54
Positive Logits
Token         Logit
remains       0.97
oid           0.88
uality        0.85
begs          0.81
is            0.79
proves        0.77
oids          0.76
that          0.75
indicates     0.75
itious        0.73
Activations Density 0.045%
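
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 9 of 20,000 tokens.
sample = [0.0] * 19_991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.045% would mean the feature is active on roughly 4 or 5 of every 10,000 tokens sampled.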