INDEX
Explanations
occurrences of the article "an" or similar variations in different contexts
Negative Logits
gren    -0.17
gue     -0.16
deo     -0.15
heid    -0.15
eenth   -0.15
entin   -0.15
ensen   -0.15
cue     -0.15
tant    -0.15
ted     -0.15
Positive Logits
bang        0.17
ascimento   0.17
r           0.17
xious       0.17
gra         0.17
ointed      0.17
ة           0.17
ulled       0.16
onym        0.16
meld        0.16
Activations Density 0.052%