INDEX
Explanations
mentions of various foods and ingredients
the letter "p" in various contexts throughout the text
Negative Logits
EDITION     -0.81
compe       -0.76
diplom      -0.71
horizont    -0.70
boarding    -0.66
resumes     -0.64
Geral       -0.61
directed    -0.60
differe     -0.60
juggling    -0.59
Positive Logits
udding      1.39
udd         1.36
uddle       1.34
anda        1.25
uddin       1.24
rawn        1.24
uffer       1.22
uffy        1.21
oodle       1.20
ixie        1.19
Activations Density 0.036%