INDEX
Explanations
words related to proper nouns
words related to forms or structures
Negative Logits
prints      -0.73
Decker      -0.68
Rico        -0.66
lder        -0.65
Maya        -0.64
Fas         -0.64
Fey         -0.63
            -0.62
Absent      -0.61
cho         -0.61
Positive Logits
orph        1.04
ont         0.94
etheus      0.89
orm         0.88
ingo        0.87
ammu        0.86
ontent      0.85
inational   0.85
orthy       0.85
yss         0.85
Activations Density 0.016%