Explanations
instances of individuals achieving firsts or breaking barriers in various contexts
Negative Logits
enge      -0.16
イド      -0.15
674       -0.15
リス      -0.15
merce     -0.15
ayet      -0.15
ornings   -0.15
ninger    -0.14
ocker     -0.14
ackson    -0.14
Positive Logits
ever      0.36
-ever     0.30
ever      0.28
jamais    0.23
woman     0.22
Ever      0.21
person    0.21
EVER      0.21
male      0.21
female    0.21
Activations Density 0.051%