Explanations
names and initials associated with people
proper nouns and significant names
Negative Logits
Bots         -0.78
Cody         -0.76
Vera         -0.75
avorite      -0.73
ãĥ¯          -0.71
metic        -0.70
ortunately   -0.70
warr         -0.68
ĸļ           -0.67
earthqu      -0.67
Positive Logits
^            0.88
iac          0.79
ais          0.76
uth          0.75
HY           0.75
UCT          0.75
aug          0.73
agram        0.73
oyal         0.72
alle         0.72
Activations Density 0.233%