INDEX
Explanations
proper nouns from various contexts
examples of situations or entities that act as representations of broader concepts or themes
Negative Logits
pora        -0.73
phal        -0.69
BSD         -0.67
Wan         -0.65
Immunity    -0.65
holiest     -0.65
Ward        -0.64
oats        -0.63
AMA         -0.63
hump        -0.62
Positive Logits
ified       1.20
orer        1.16
ifier       1.09
ifies       1.08
ifying      1.03
ifiers      1.03
ification   1.03
ars         0.96
iment       0.93
eness       0.91
Activations Density: 0.019%