INDEX
Explanations
people's first instances of certain experiences or events
instances of the word "first" in various contexts
Negative Logits
mund      -0.83
facts     -0.74
bara      -0.72
etics     -0.71
Canaver   -0.70
skirts    -0.70
bos       -0.68
holes     -0.68
ourge     -0.66
Vish      -0.66
Positive Logits
foray         1.25
outing        0.97
responders    0.95
baseman       0.95
incarnation   0.90
attempt       0.89
glimpse       0.88
batch         0.87
glance        0.87
lady          0.86
Activations Density: 0.046%