INDEX
    Explanations

    quantities expressed as percentages

    references to the quantity "half."

    Negative Logits
    "andr"       -0.59
    "licts"      -0.53
    " rul"       -0.53
    "sed"        -0.52
    "andi"       -0.51
    " dstg"      -0.50
    " laun"      -0.50
    "edIn"       -0.49
    " convol"    -0.49
    " condem"    -0.48
    Positive Logits
    " of"        0.93
    "heartedly"  0.80
    " thereof"   0.76
    "way"        0.73
    " the"       0.72
    "wheel"      0.71
    "ibaba"      0.70
    "OF"         0.68
    "hearted"    0.68
    "terness"    0.67
    Activation Density: 0.045%

    No Known Activations