INDEX
Explanations
isolated words or parts of words ending with "ere"
the word "ere" in various contexts, suggesting a focus on the presence of that specific term
Negative Logits
orously   -0.76
gag       -0.72
OWS       -0.71
ows       -0.70
acca      -0.68
edit      -0.67
å°Ĩ       -0.67
owed      -0.67
imates    -0.66
ished     -0.65
Positive Logits
ndum      1.24
nda       0.95
tta       0.92
cht       0.91
ttes      0.89
lease     0.89
nder      0.86
bral      0.86
llan      0.85
cki       0.84
Activations Density: 0.022%