Explanations
proper nouns, specifically names of people
specific names and representations of artistic works or performances
Negative Logits
obi      -0.87
ipper    -0.82
Cub      -0.82
apo      -0.81
omez     -0.80
ippers   -0.80
aders    -0.77
orney    -0.77
Gb       -0.76
é»Ĵ      -0.75
Positive Logits
Sat      1.64
Stand    1.49
Sit      1.43
sat      1.42
sat      1.40
STAND    1.36
Sit      1.34
sit      1.33
Stand    1.31
stand    1.30
Activations Density: 0.239%