INDEX
Explanations
mentions of well-known figures in various contexts
positive attributes and characteristics of individuals, especially in the context of their reputation
Negative Logits
Token             Logit
olutions          -0.69
Solution          -0.67
unicip            -0.66
dispers           -0.64
phrine            -0.63
ļéĨĴ              -0.63
downstream        -0.63
incent            -0.63
olving            -0.62
Implementation    -0.62
Positive Logits
Token        Logit
his          1.35
he           1.26
His          1.17
his          1.13
Born         1.06
His          1.04
him          1.03
Born         1.02
biography    0.95
HIS          0.92
Activations Density 0.753%
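
The density figure above is not defined on this page. A minimal sketch, assuming it means the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    The dashboard itself does not state how the figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 token positions; the feature fires on 2.
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.753% would correspond to the feature firing on roughly 75 of every 10,000 sampled tokens.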