INDEX
Explanations
phrases indicating a comparison or contrast in context
Negative Logits
orgia     -0.15
leet      -0.15
arms      -0.13
µľ        -0.13
scribe    -0.13
.Meta     -0.13
{{--<     -0.13
Chall     -0.13
.mj       -0.13
stype     -0.13
Positive Logits
irit      0.16
ghan      0.14
odor      0.14
odox      0.14
disgr     0.13
ecta      0.13
Gig       0.13
758       0.13
cab       0.13
fulness   0.13
Activations Density 1.215%