Explanations
phrases expressing strong impact or emphasis, such as 'come true' or 'pendent'
expressions indicating truth or reliability
Negative Logits
Token       Logit
ーティ      -0.58
omal        -0.56
ouched      -0.52
athered     -0.52
uable       -0.52
licted      -0.51
iva         -0.51
ilateral    -0.50
lict        -0.50
oreal       -0.50
Positive Logits
Token       Logit
:)          0.74
;)          0.72
:-)         0.62
¯           0.60
since       0.60
🙂          0.58
regarding   0.58
haha        0.57
:           0.57
↵↵          0.55
Activations Density 0.960%