Explanations
prominent figures, such as physicists, psychologists, artists, and political commentators
terms related to professional roles or occupations
Negative Logits
aber      -0.89
crew      -0.85
merga     -0.81
ramid     -0.81
word      -0.78
acus      -0.77
cale      -0.74
perm      -0.74
ivery     -0.74
assian    -0.72
Positive Logits
extraord  0.99
Stephen   0.94
Laura     0.90
Shaun     0.89
Andrew    0.89
Tony      0.88
Richard   0.88
Carl      0.88
David     0.87
Todd      0.86
Activations Density: 0.301%