Explanations
positive reviews and family references
possessive pronouns
Negative Logits
myſelf      -1.07
reaſon      -1.04
EconPapers  -1.01
purpoſe     -1.01
poffible    -1.00
Anſ         -0.99
auffi       -0.98
pleaſure    -0.93
ſeveral     -0.91
themſelves  -0.91
Positive Logits
            0.54
sou         0.51
i           0.49
only        0.49
bo          0.49
“           0.49
‘           0.48
far         0.47
typeparam   0.47
k           0.47
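
The page does not say how the logit lists above are produced. A common convention for dashboards of this kind, assumed here and not confirmed by the source, is to project the feature's decoder direction through the model's unembedding matrix and list the tokens with the most negative and most positive results. The sketch below illustrates that idea on toy data; `feature_logits`, `top_logit_tokens`, the vectors, and the token names are all hypothetical.

```python
def feature_logits(decoder_dir, unembed):
    """Project a feature direction through an unembedding matrix.

    decoder_dir: list of floats, length d_model (hypothetical feature direction).
    unembed: list of d_model rows, each of length vocab_size.
    Returns one logit contribution per vocabulary entry.
    """
    vocab_size = len(unembed[0])
    return [
        sum(d * row[j] for d, row in zip(decoder_dir, unembed))
        for j in range(vocab_size)
    ]


def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Return (most negative, most positive) lists of (token, logit) pairs."""
    logits = feature_logits(decoder_dir, unembed)
    ranked = sorted(zip(vocab, logits), key=lambda pair: pair[1])
    negative = ranked[:k]            # smallest values first
    positive = ranked[::-1][:k]      # largest values first
    return negative, positive


# Toy data only: a 2-dimensional model with a 3-token vocabulary.
decoder_dir = [1.0, -1.0]
unembed = [
    [0.5, -1.0, 0.0],
    [0.0, 0.0, 0.25],
]
vocab = ["tok_a", "tok_b", "tok_c"]

negative, positive = top_logit_tokens(decoder_dir, unembed, vocab, k=1)
```

Under this reading, a negative value means the feature pushes the model away from predicting that token, and a positive value means it pushes toward it.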
Activations Density 0.544%
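
The page does not define activation density. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below computes that fraction on made-up activations; `activation_density` and the `threshold` parameter are hypothetical names, not part of any dashboard API.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    activations = list(activations)
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Toy data only: the feature fires on 5 of 1000 tokens.
acts = [1.0] * 5 + [0.0] * 995
density = activation_density(acts)
density_text = f"{density:.3%}"  # formatted the same way as the figure on the page
```

With this definition, a density of 0.544% would mean the feature fires on roughly 5 or 6 of every 1000 tokens sampled.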