INDEX
Explanations
phrases related to playful teasing or provocation
references to playful or critical commentary, often using the word "poke."
Negative Logits
Token     Logit
ingham    -0.91
recy      -0.74
inction   -0.68
Percent   -0.67
icter     -0.66
Decl      -0.65
gray      -0.65
uther     -0.65
ifted     -0.63
eded      -0.63
Positive Logits
Token     Logit
poking    0.99
poke      0.92
holes     0.80
peek      0.80
ature     0.79
cheon     0.75
asso      0.72
recess    0.71
geon      0.71
poked     0.70
Activations Density: 0.042%