    Explanations

    references to members of the royal family, specifically the word "Prince"

    Negative Logits
    Token          Logit
    "anmar"        -0.17
    "ctype"        -0.16
    " povol"       -0.15
    "isoft"        -0.15
    "agogue"       -0.15
    "memberOf"     -0.15
    "RITE"         -0.15
    " Garland"     -0.15
    " IBOutlet"    -0.14
    " прит"        -0.14
    Positive Logits
    Token          Logit
    "(ss"          0.25
    "/ss"          0.24
    "esses"        0.23
    "ps"           0.22
    "ess"          0.20
    " Consort"     0.20
    " charming"    0.19
    "essa"         0.19
    "ippet"        0.18
    "есс"          0.18
    Activation Density: 0.009%

    No Known Activations