INDEX
Explanations
proper nouns, particularly names of people
Negative Logits
Female            -0.14
#(                -0.13
4                 -0.13
-(                -0.13
/:                -0.13
/↵                -0.13
.EntityFramework  -0.13
#:                -0.13
()↵               -0.12
Duchess           -0.12
Positive Logits
.        0.30
Justice  0.21
.]       0.16
ç¶ļ      0.15
opher    0.14
iven     0.14
.He      0.14
de       0.14
Justice  0.14
.Ab      0.14
Activations Density: 0.088%