Explanations
names or partial names containing "K" followed by a vowel that activates strongly, such as 'a', 'o', or 'u'
proper nouns, specifically names
Negative Logits
IBLE        -0.80
ãĥ¼ãĥĨãĤ£   -0.69
Ö¼          -0.68
天          -0.66
ional       -0.64
eric        -0.62
à¨          -0.62
UCT         -0.60
د           -0.59
س           -0.58
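Some of the token strings listed above (for example ãĥ¼ãĥĨãĤ£ and Ö¼) look like the display form used by byte-level BPE tokenizers, where each raw byte is shown as a stand-in printable character. The sketch below assumes the GPT-2 style byte-to-character table; the source does not name the model or tokenizer, so that is an assumption, and the helper names `byte_to_char_table` and `decode_token` are hypothetical. Under that assumption it maps a displayed token back to readable text.

```python
def byte_to_char_table():
    """Rebuild the GPT-2 style byte -> display-character table (assumed, not confirmed by the source)."""
    # Bytes that are shown as themselves: printable ASCII and most of Latin-1.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code point 256 and above, in ascending byte order.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse lookup: display character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(display):
    """Convert a displayed token back to text; return it unchanged if it is not in byte-level form."""
    if any(ch not in CHAR_TO_BYTE for ch in display):
        return display  # already readable text, e.g. a CJK or Arabic character
    raw = bytes(CHAR_TO_BYTE[ch] for ch in display)
    # Tokens can end mid-character; incomplete UTF-8 sequences become U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["IBLE", "ãĥ¼ãĥĨãĤ£", "Ö¼", "天"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, ãĥ¼ãĥĨãĤ£ decodes to the katakana fragment ーティ and Ö¼ to the Hebrew point U+05BC. A token such as à¨ is only part of a multi-byte character and cannot be decoded on its own.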
Positive Logits
inski       0.98
lyak        0.94
lein        0.90
lov         0.90
lar         0.90
insky       0.88
laus        0.86
patrick     0.86
istani      0.83
anski       0.83
Activations Density 0.077%