Explanations
instances of the word "most."
Negative Logits
Token      Logit
Oise       -0.83
er         -0.80
quele      -0.73
menores    -0.69
Crowe      -0.67
Pau        -0.67
Boe        -0.66
io         -0.66
ه          -0.66
rubin      -0.65
Positive Logits
Token      Logit
most       1.58
most       1.51
MOST       1.35
MOST       1.29
Most       1.25
Most       1.15
meeste     0.97
plupart    0.95
moſt       0.93
meisten    0.91
Activations Density 0.075%