Explanations
words ending in -ance, -ence, -iance
Negative Logits
sh       -0.20
set      -0.19
so       -0.19
ness     -0.18
shift    -0.18
ship     -0.18
son      -0.18
name     -0.17
rec      -0.17
reg      -0.17
Positive Logits
es       0.21
y        0.18
able     0.18
ously    0.18
ous      0.17
urs      0.15
ance     0.15
au       0.14
ois      0.13
e        0.13
Activations Density: 0.174%