Explanations
instances where the word "say" is used in a sentence
Negative Logits
Token       Logit
obal        -0.80
aughs       -0.76
leeve       -0.74
Justice     -0.70
ctic        -0.68
andem       -0.68
ament       -0.65
kefeller    -0.65
destro      -0.65
ressive     -0.64

Positive Logits
Token       Logit
goodbye     0.81
lihood      0.80
parts       0.76
hello       0.72
Pearce      0.65
Schr        0.63
Nab         0.60
Angela      0.60
Bacon       0.60
Dunk        0.60
Activations Density: 0.020%