Explanations
conjunctions and connectors within sentences
Negative Logits
Token     Logit
atr       -0.16
imon      -0.14
AREN      -0.13
isko      -0.13
avery     -0.13
diam      -0.13
mons      -0.13
arella    -0.13
cek       -0.13
roys      -0.13
Positive Logits
Token     Logit
all       0.21
etc       0.20
etc       0.18
all       0.15
âng       0.14
oken      0.14
others    0.14
uzey      0.14
.define   0.14
burgh     0.14
Activations Density: 0.116%