INDEX
    Explanations
    Negative Logits
    "esehen"    0.70
    "с"         0.67
    "miner"     0.66
    "rie"       0.65
    "they"      0.64
    "ight"      0.63
    "ة"         0.62
    "nett"      0.62
    "rund"      0.62
    "ranet"     0.62
    Positive Logits
    "el"        0.93
    "as"        0.91
    "ב"         0.88
    " on"       0.80
    (blank)     0.80
    "on"        0.79
    (blank)     0.76
    "ed"        0.76
    (blank)     0.76
    "ר"         0.75
    Activation Density: 8.943%

    No Known Activations