INDEX
Explanations
names of people
names of individuals or proper nouns
Negative Logits
âĶĢâĶĢ      -0.88
ModLoader   -0.77
etheless    -0.75
theless     -0.72
underwater  -0.69
ãĥŁ         -0.68
Colossus    -0.67
Galileo     -0.67
LEASE       -0.64
ãĥİ         -0.64
Positive Logits
ansky       1.08
atz         1.07
inski       1.05
lett        1.03
itz         1.02
insky       1.00
antz        1.00
anson       0.99
low         0.97
sell        0.97
Activations Density: 0.278%