Explanations
instances of the letter 'A' at the beginning of phrases or sentences
Negative Logits
            -0.18
usal        -0.15
233         -0.14
rather      -0.14
etc         -0.13
closely     -0.13
thus        -0.13
mars        -0.13
323         -0.13
somewhat    -0.13
Positive Logits
itionally   0.17
lot         0.17
istically   0.17
entimes     0.16
Lot         0.15
edis        0.15
んど        0.15
unya        0.15
-wise       0.15
obody       0.15
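The page does not say how these two lists are produced. As a minimal sketch, assume every vocabulary token has one scalar logit effect for this feature and that the lists are the k lowest and k highest of those values. The function name `top_logits`, the dictionary layout, and the "the" entry are hypothetical; the other four values are copied from the lists above for illustration only.

```python
def top_logits(logit_effects, k=10):
    """Return (most_negative, most_positive) lists of (token, value) pairs."""
    # Sort ascending by value so the most negative effects come first.
    ranked = sorted(logit_effects.items(), key=lambda kv: kv[1])
    most_negative = ranked[:k]
    # Reverse the ascending order to read off the largest values first.
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# Hypothetical input: token -> logit effect (assumed layout, not the tool's API).
example = {"rather": -0.14, "thus": -0.13, "lot": 0.17, "-wise": 0.15, "the": 0.0}
neg, pos = top_logits(example, k=2)
```

With this input, `neg` holds the "rather" and "thus" entries and `pos` holds "lot" and "-wise", matching the ordering convention of the lists above (most extreme value first).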
Activations Density 0.038%
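One plausible reading of this figure, offered as an assumption rather than the tool's documented definition, is the fraction of sampled token positions on which the feature's activation is above zero. Under that reading, 0.038% corresponds to roughly 38 firing positions per 100,000 tokens. The sketch below uses synthetic data and a hypothetical helper name, `activation_density`.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Synthetic data (not from the source): 38 firing positions in 100,000 tokens.
sample = [1.0] * 38 + [0.0] * 99_962
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is sparse: it is silent on nearly all tokens, so the explanation above only needs to account for a small set of contexts.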