    Negative Logits
    Token         Logit
    "itu"         -0.69
    "aspers"      -0.61
    "creen"       -0.61
    "ussen"       -0.60
    "addon"       -0.60
    " Horus"      -0.60
    "abuse"       -0.58
    " leveled"    -0.58
    "hold"        -0.58
    "ivari"       -0.58
    Positive Logits
    Token         Logit
    "lems"        0.94
    "verning"     0.81
    " viral"      0.79
    "ggle"        0.77
    "vt"          0.72
    "ogue"        0.72
    " limp"       0.70
    "ODY"         0.69
    "eker"        0.68
    "og"          0.68
    Activation Density: 0.035%

    No Known Activations