INDEX
    Explanations

    dashboard, summit, dataset, workshops

    Negative Logits
    Token       Value
    " as"       0.80
    "1"         0.76
    "d"         0.75
    "ного"      0.64
    "nya"       0.63
    "yle"       0.61
    "dan"       0.61
    "ning"      0.59
    "ike"       0.58
    "ل"         0.58
    Positive Logits
    Token                 Value
    " équipes"            0.76
    " oreilles"           0.72
    "이드"                0.71
    "၀၀"                  0.70
    ")$."                 0.70
    (no visible token)    0.70
    " μέρος"              0.68
    (no visible token)    0.67
    " aplikace"           0.67
    "\,\"                 0.66
    Activation Density: 0.074%

    No Known Activations