    Negative Logits
    Token           Logit
    "edly"          -0.69
    "irection"      -0.68
    "buffer"        -0.67
    "ochond"        -0.67
    " clipboard"    -0.65
    "sworth"        -0.64
    "orph"          -0.63
    "ives"          -0.62
    "addons"        -0.61
    "OLOGY"         -0.61
    Positive Logits
    Token           Logit
    " Claus"        1.51
    " Monica"       1.11
    " Clara"        1.09
    " Cruz"         1.05
    " Barbara"      0.98
    " Rosa"         0.98
    " Ana"          0.98
    " Clause"       0.93
    " Clar"         0.92
    "Cruz"          0.87
    Activation Density: 0.014%

    No Known Activations