Explanations
articles and determiners preceding nouns or noun phrases
Negative Logits
Token      Logit
bor        -0.18
ough       -0.15
ss         -0.15
OKIE       -0.15
.Kind      -0.14
scriber    -0.14
oy         -0.14
owo        -0.14
보         -0.14
ounter     -0.14
Positive Logits
Token      Logit
person     0.18
好的       0.17
good       0.16
recent     0.16
rouw       0.15
OSC        0.15
persons    0.15
rios       0.15
ularity    0.14
зы         0.14
Activations Density 0.172%
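
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under a stated assumption: that density is the fraction of sampled positions whose activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of active positions in a sample.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 172 of them active.
sample = [0.0] * 99_828 + [1.3] * 172
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.172% corresponds to roughly 1.7 active positions per 1,000 tokens.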