INDEX
Explanations
mentions of a large quantity of something
the repeated mention of the word "dozens" to indicate quantity
Negative Logits
Token       Logit
Lover       -0.65
fixation    -0.64
acqu        -0.64
position    -0.63
emb         -0.63
WB          -0.62
forward     -0.61
thus        -0.61
poses       -0.60
pose        -0.60
Positive Logits
Token       Logit
dozens      0.91
thousands   0.88
thous       0.87
dozen       0.86
Dozens      0.83
hundreds    0.82
ĸļ          0.81
tens        0.80
thousand    0.80
uously      0.80
Activations Density 0.005%
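
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature's activation is nonzero. The page does not define the metric, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: one active token out of 20,000 positions.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% would mean the feature fires on roughly 1 in every 20,000 tokens, which fits a feature tied to a narrow set of quantity words such as "dozens".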