INDEX
Explanations
mentions of the word "squirrel."
references to caffeine and squirrels
Negative Logits
fare        -0.75
Cuban       -0.65
Lens        -0.65
abeth       -0.64
FUL         -0.60
Citation    -0.60
ewski       -0.60
forge       -0.59
emate       -0.58
casts       -0.58
Positive Logits
reys        0.93
isher       0.86
inated      0.81
irrel       0.80
iversity    0.80
berries     0.80
eree        0.79
inates      0.78
ohydrate    0.74
ished       0.74
Activation Density: 0.022%
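As a rough illustration of how a density figure like this is usually read, the sketch below treats activation density as the fraction of token positions on which the feature fires at all. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page itself does not say how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: a position counts as active when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, only 2 of them active.
sample = [0.0] * 9998 + [1.3, 0.7]
density = activation_density(sample)

# Format as a percentage with three decimals, the same style as the figure above.
print(f"{density:.3%}")
```

Under this reading, a density of 0.022% would mean the feature is active on roughly 2 out of every 10,000 tokens, which fits a feature tied to a narrow topic such as the explanations listed above.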