INDEX
    Explanations

    numerical answers and final results

    Negative Logits
    Token               Logit
    "ycop"              0.68
    "фель"              0.68
    " G"                0.66
    "G"                 0.65
    (no token shown)    0.64
    " полі"             0.62
    " Mother"           0.61
    "gre"               0.61
    " Da"               0.61
    "ґ"                 0.59
    Positive Logits
    Token               Logit
    "Anne"              0.58
    " strip"            0.55
    (no token shown)    0.54
    "strip"             0.53
    " പ്രവർത്തന"         0.52
    "SL"                0.51
    "änge"              0.51
    "swa"               0.51
    " strips"           0.50
    (no token shown)    0.50
    Activation Density: 0.094%

    No Known Activations