INDEX
Explanations
phrases related to evaluation or judgment
key terms and phrases indicating simplicity or clarity
Negative Logits
ãĥ¼ãĥĨãĤ£    -0.52
grand    -0.51
ãĥij    -0.50
fet    -0.48
ãĥĭ    -0.47
teasp    -0.46
ãĤ¸    -0.46
ãĥ¼ãĥ    -0.45
pri    -0.45
utor    -0.45
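Several of the negative-logit tokens above (for example ãĥ¼ãĥĨãĤ£) look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The sketch below shows one way to turn such strings back into text, assuming the vocabulary uses the GPT-2 style byte-to-character mapping; the source does not state which tokenizer produced these tokens, so that is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the byte -> character table used by GPT-2 style byte-level BPE.

    Printable bytes map to themselves; every other byte is shifted to a
    code point at 256 or above so that each byte has a visible character.
    """
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Invert the table: display character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map a vocabulary string back to bytes, then decode those bytes as UTF-8.

    Tokens that end in the middle of a multi-byte character cannot be fully
    decoded; errors="replace" marks the incomplete part with U+FFFD.
    """
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Tokens copied from the negative-logit list (assumed GPT-2 style vocabulary).
    for tok in ["ãĥ¼ãĥĨãĤ£", "ãĤ¸", "ãĥ¼ãĥ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption the first token decodes to the katakana sequence ーティ, which suggests that the non-ASCII entries in the list are fragments of Japanese text. A token such as ãĥ¼ãĥ ends partway through a character, so only its leading part can be recovered.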
Positive Logits
;)    0.59
:)    0.59
because    0.53
!    0.52
;    0.52
:-)    0.51
when    0.51
:(    0.51
!!!!!    0.49
!,    0.48
Activations Density 1.725%