INDEX
    Explanations

    phrases related to a particular person's name

    words related to specific character names or titles

    Negative Logits
    " unacceptable"    -0.69
    " unaff"           -0.68
    " clubhouse"       -0.66
    " friendship"      -0.64
    " quicker"         -0.63
    " signing"         -0.63
    " unf"             -0.63
    " liking"          -0.62
    " mates"           -0.61
    " improvement"     -0.61
    Positive Logits
    "chin"              2.79
    "iren"              1.59
    "GI"                1.57
    "zhen"              1.38
    "olin"              1.37
    "rin"               1.31
    "uchin"             1.24
    " Chin"             1.14
    "irin"              1.10
    " veil"             1.08
    Activation Density: 0.019%

    No Known Activations