INDEX
    Explanations

    proper nouns, particularly names of characters and titles from stories or media

    Negative Logits
    Token          Logit
    "ALLY"         -0.16
    "_EOL"         -0.16
    "Spoiler"      -0.15
    "iaux"         -0.15
    "beg"          -0.15
    "hire"         -0.15
    "_finalize"    -0.15
    "ÑĪÑĮ"         -0.14
    "gaard"        -0.14
    "argas"        -0.14
    Positive Logits
    Token          Logit
    "uki"          0.17
    " B"           0.15
    "etti"         0.15
    " Harris"      0.14
    "achi"         0.13
    " mine"        0.13
    " M"           0.13
    " lt"          0.13
    "irie"         0.13
    "Vo"           0.13
    Activation Density: 0.030%

    No Known Activations