    Explanations

    references to a specific person or character named "He."

    Negative Logits
    omial       -0.18
    weep        -0.15
    th          -0.15
    y           -0.15
    liv         -0.14
    っと        -0.14
    ety         -0.14
    onne        -0.14
    mag         -0.14
    yen         -0.14
    Positive Logits
    idelberg     0.22
    isman        0.22
    bron         0.22
    brew         0.21
    imat         0.21
    inz          0.20
    /she         0.20
    aviest       0.20
    fce          0.20
    avit         0.18
    Activation Density: 0.018%

    No Known Activations