INDEX
Explanations
phrases that denote quantity or comparison
references to multiple instances or occurrences of similar items or events
Negative Logits
Akin        -0.80
matter      -0.72
Emb         -0.72
john        -0.67
Ezek        -0.66
mitochond   -0.63
Lov         -0.63
embed       -0.62
Amon        -0.60
borough     -0.60
Positive Logits
quir        0.78
finalists   0.76
additions   0.70
favorites   0.69
oggle       0.69
contenders  0.66
renches     0.66
quished     0.63
factors     0.63
phabet      0.63
Activations Density: 0.077%