INDEX
Explanations
phrases indicating comparison or contrast in various contexts
Negative Logits
ulp        -0.17
èĴĤ        -0.15
istics     -0.15
amba       -0.15
cÃłng      -0.14
Anthem     -0.14
ucht       -0.14
itals      -0.13
.weixin    -0.13
kako       -0.13
Positive Logits
happened   0.29
happens    0.24
occurred   0.23
happ       0.23
during     0.22
with       0.22
occurs     0.22
happen     0.21
in         0.21
done       0.20
Activations Density: 0.143%