Explanations
names or terms related to various industries or fields of work
current trends and significant cultural or social themes
Negative Logits
ãĥ¼ãĥĨãĤ£    -0.52
rely         -0.52
ĪĴ           -0.52
pload        -0.50
pse          -0.49
vice         -0.49
MIS          -0.48
rocal        -0.48
©¶æ          -0.44
eatures      -0.44
Positive Logits
.            1.04
;            0.98
because      0.97
but          0.97
.[           0.92
,[           0.91
although     0.82
,''          0.82
whereas      0.78
despite      0.77
Activations Density: 1.175%