Explanations
references to the word "Sugar"
mentions of "Sugar" and "Sweet" in various contexts
Negative Logits
Token      Logit
posit      -0.78
regener    -0.76
dh         -0.73
REM        -0.73
commun     -0.73
foc        -0.71
detach     -0.71
ax         -0.71
radi       -0.71
cath       -0.70
Positive Logits
Token      Logit
Sugar      3.51
Sug        1.98
Sweet      1.90
Cotton     1.81
ugar       1.70
Sweet      1.63
Candy      1.40
Grape      1.36
Soda       1.35
Honey      1.29
Activations Density: 0.029%
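
To put the density figure in concrete terms, the short sketch below converts a percentage into an expected activation count. It assumes that "activations density" means the fraction of sampled tokens on which the feature activates at all; the page does not define the term, so this is an interpretation. The function names are illustrative and not part of any dashboard API.

```python
def activation_fraction(density_percent: float) -> float:
    """Convert a density given in percent to a plain fraction."""
    return density_percent / 100.0


def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of activating tokens among n_tokens.

    Assumes density is the share of tokens with a nonzero activation.
    """
    return activation_fraction(density_percent) * n_tokens


def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens seen per activation under the same assumption."""
    return 100.0 / density_percent


if __name__ == "__main__":
    density = 0.029  # percent, as reported on the page
    print(expected_activations(density, 1_000_000))
    print(tokens_per_activation(density))
```

Under that reading, a density of 0.029% corresponds to roughly 290 activating tokens per million, or about one activation every 3,400 to 3,500 tokens, which fits a feature tied to a narrow set of words.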