Explanations
names which could be first and last names
proper names, specifically names likely related to prominent people
Negative Logits
theless   -0.81
RIC       -0.59
ãĤº       -0.58
Versions  -0.58
Gems      -0.56
Pixie     -0.56
mining    -0.55
âĶĢâĶĢ    -0.55
âķIJ      -0.54
å§«       -0.54
Positive Logits
ensen     0.66
ohan      0.63
atoon     0.62
Gaal      0.61
EStream   0.59
Keefe     0.57
anyon     0.57
¥µ        0.56
zyme      0.55
phabet    0.55
Activations Density 0.122%
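
Several tokens in the logit lists (for example ãĤº, âĶĢâĶĢ and å§«) look like text in a byte-level BPE display alphabet rather than encoding damage. The sketch below assumes the vocabulary uses the GPT-2 style byte-to-character table, which this page does not confirm, and maps such strings back to the UTF-8 text they stand for. The names `bytes_to_unicode` and `decode_token` are illustrative and not part of any tool referenced here.

```python
def bytes_to_unicode():
    """Byte -> printable character table (assumed GPT-2 style byte-level BPE)."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # U+00A1 .. U+00AC
          + list(range(0xAE, 0x100)))  # U+00AE .. U+00FF
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def decode_token(display: str) -> str:
    """Turn a displayed token string back into text.

    Byte sequences that are not complete UTF-8 come back as U+FFFD.
    Raises KeyError if a character is outside the assumed table.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["\u00e3\u0124\u00ba", "\u00e2\u0136\u0122\u00e2\u0136\u0122", "\u00e5\u00a7\u00ab"]:
        print(repr(tok), "->", repr(decode_token(tok)))
    # 'ãĤº' -> 'ズ'
    # 'âĶĢâĶĢ' -> '──'
    # 'å§«' -> '姫'
```

Under this assumption, a token such as ¥µ holds only continuation bytes, so it has no standalone text form and is best read as a fragment of a longer multi-byte character. Plain ASCII tokens such as ensen or Keefe pass through unchanged.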