Explanations
the word "rie" with varying activation strengths
mentions of the name "Berie" in various contexts
Negative Logits
idious     -0.82
iop        -0.78
tnc        -0.72
rity       -0.71
imir       -0.69
agnetic    -0.69
ional      -0.67
agons      -0.66
worldly    -0.66
iary       -0.65
Positive Logits
rie        1.03
vre        0.99
ves        0.93
ve         0.85
bers       0.82
ving       0.82
ptions     0.82
pton       0.81
zo         0.80
bs         0.80
Activations Density: 0.007%