INDEX
Explanations
proper nouns, specifically names of individuals
Negative Logits
kefeller    -0.71
enegger     -0.70
helle       -0.68
corners     -0.67
pton        -0.66
ppers       -0.62
foreseen    -0.61
hern        -0.61
mallow      -0.61
aukee       -0.60
Positive Logits
Pacific           0.79
CLS               0.72
azi               0.69
adesh             0.69
¥µ                0.68
MIT               0.67
Blaze             0.67
awa               0.64
°                 0.63
TPPStreamerBot    0.63
Activations Density: 0.155%