INDEX
Explanations
contractions involving "are" and "you're."
Negative Logits
xCD     -0.16
ilon    -0.16
itself  -0.15
çĿ      -0.14
idir    -0.14
sume    -0.14
PEND    -0.13
228     -0.13
rw      -0.13
ivan    -0.13
Positive Logits
lucky       0.23
ever        0.22
nt          0.20
fortunate   0.18
unfamiliar  0.18
anywhere    0.17
unsure      0.17
planning    0.17
feeling     0.17
ucky        0.16
Activations Density 0.075%