Explanations
instances of the word "a"
instances of the article "a" or similar indefinite articles
Negative Logits
Token           Logit
withd           -0.59
"}],"           -0.59
itor            -0.58
DragonMagazine  -0.58
ulous           -0.57
atis            -0.56
jit             -0.56
"$:/            -0.55
irin            -0.53
esan            -0.53
Positive Logits
Token           Logit
a               1.49
an              1.28
another         1.15
something       0.97
some            0.85
the             0.83
someone         0.77
any             0.72
alot            0.71
their           0.70
Activations Density 1.030%