INDEX
    Explanations

    proper nouns, particularly names of people

    Negative Logits
    Token                  Logit
    "Female"               -0.14
    " #("                  -0.13
    "4"                    -0.13
    "-("                   -0.13
    " /:"                  -0.13
    " /↵"                  -0.13
    ".EntityFramework"     -0.13
    " #:"                  -0.13
    " ()↵"                 -0.12
    " Duchess"             -0.12
    Positive Logits
    Token                  Logit
    "."                    0.30
    " Justice"             0.21
    ".]"                   0.16
    "ç¶ļ"                  0.15
    "opher"                0.14
    "iven"                 0.14
    ".He"                  0.14
    " de"                  0.14
    "Justice"              0.14
    ".Ab"                  0.14
    Act Density 0.088%

    No Known Activations