INDEX
Explanations
the possessive form of nouns
instances of the letter "s"
Negative Logits
agos       -0.70
ey         -0.63
isch       -0.62
doms       -0.59
izzard     -0.59
esm        -0.58
trust      -0.58
cake       -0.57
Wars       -0.56
Archdemon  -0.56
Positive Logits
why        0.92
how        0.89
whats      0.88
another    0.88
what       0.86
excerpts   0.80
how        0.78
omething   0.77
recap      0.76
some       0.75
Activations Density 0.022%