INDEX
Explanations
embark, embarrass, embellish, imbue, embassy
Negative Logits
oh       -0.11
yx       -0.10
oi       -0.10
y        -0.09
ern      -0.09
ergic    -0.09
utton    -0.09
fully    -0.09
bsp      -0.09
OUN      -0.09
Positive Logits
emb      0.21
Emb      0.20
emb      0.15
Emb      0.13
assy     0.12
argo     0.11
roid     0.11
ayment   0.10
embark   0.10
èĥİ      0.10
Activations Density 0.020%