INDEX
Explanations
proper nouns
the phrase "one of," frequently indicating membership or classification within a larger group or category
Negative Logits
zeb        -0.93
anse       -0.65
lees       -0.63
matter     -0.62
ption      -0.60
au         -0.59
%:         -0.58
IRE        -0.58
istine     -0.58
claimed    -0.57
Positive Logits
several    1.16
dozens     1.05
few        1.03
four       1.02
five       1.02
many       1.02
seven      1.00
three      1.00
eight      0.99
nine       0.99
Activations Density 0.056%