INDEX
Explanations
phrases and words that indicate comparison or contrast in context
Negative Logits
Token      Logit
().'/      -0.14
anny       -0.14
Natasha    -0.14
ideas      -0.14
Shepard    -0.14
]){        -0.14
412        -0.13
意思       -0.13
стоя       -0.13
itime      -0.13
Positive Logits
Token      Logit
typical    0.22
Typical    0.20
typically  0.17
Typically  0.16
_default   0.16
典         0.16
typ        0.15
_DEFAULT   0.15
.default   0.15
typically  0.15
Activations Density 0.004%