Explanations
phrases that involve numerical quantities or counts
Negative Logits
Athenians    -0.70
chevalier    -0.68
fédé         -0.62
Ordovician   -0.62
houſe        -0.62
depositors   -0.61
soldat       -0.61
galerie      -0.61
savages      -0.60
chimique     -0.59
Positive Logits
two          0.91
three        0.82
major        0.80
different    0.80
separate     0.76
new          0.75
prominent    0.75
other        0.74
mini         0.73
dozen        0.72
Activations Density: 0.495%