INDEX
Explanations
proper nouns and titles of various entities
articles that signal the introduction of nouns or noun phrases
Negative Logits
Token           Logit
anism           -0.78
Izan            -0.76
chops           -0.75
denies          -0.75
knows           -0.73
acknowledges    -0.71
chuk            -0.70
faces           -0.68
could           -0.68
listens         -0.68
Positive Logits
Token           Logit
versatile       0.89
unique          0.83
fascinating     0.83
prominent       0.83
standalone      0.82
significant     0.82
comprehensive   0.81
igmatic         0.80
wonderful       0.80
huge            0.80
Activations Density 0.187%