    Negative Logits
    "van"        -0.10
    "enburg"     -0.09
    "agn"        -0.09
    "acci"       -0.09
    "cles"       -0.09
    "case"       -0.09
    "gaard"      -0.09
    " cann"      -0.09
    "drawing"    -0.09
    "umi"        -0.09
    Positive Logits
    "iors"       0.14
    "ior"        0.14
    "egal"       0.14
    "rio"        0.14
    "eca"        0.14
    "Sen"        0.13
    "IOR"        0.13
    " Sen"       0.13
    "ario"       0.12
    " sen"       0.12
    Activation Density: 0.024%

    No Known Activations