INDEX
Explanations
the word "tens" followed by any number
phrases related to large quantities of people or items
Negative Logits
Shrine        -0.64
mint          -0.64
Colony        -0.64
agate         -0.63
Boards        -0.60
Manifest      -0.60
commentary    -0.60
adv           -0.59
Slayer        -0.59
messenger     -0.58
Positive Logits
omet          1.00
eteen         0.99
ourcing       0.97
elfth         0.96
atile         0.96
eenth         0.95
imet          0.92
eteenth       0.90
umbing        0.87
atility       0.87
Activations Density: 0.013%