Explanations
phrases discussing contrasts or comparisons between two or more items or concepts
Negative Logits
rose        -0.20
shed        -0.15
igram       -0.15
ìĦ±ìĿĦ      -0.15
.Dispatch   -0.15
lio         -0.15
Gesture     -0.14
/read       -0.14
agma        -0.14
dur         -0.14
Positive Logits
iating      0.21
iator       0.21
/error      0.20
between     0.20
iale        0.18
iable       0.18
iative      0.18
ially       0.16
/div        0.16
icult       0.16
Activations Density 0.039%