INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    " genotype"      -0.77
    "zados"          -0.75
    " LaSalle"       -0.72
    "ález"           -0.70
    " Cartoon"       -0.68
    " Shen"          -0.68
    " Toner"         -0.68
    "ří"             -0.67
    " dallo"         -0.66
    " stipulated"    -0.65
    POSITIVE LOGITS
    Token            Logit
    "Architect"      0.77
    " controller"    0.72
    "service"        0.71
    " Architect"     0.70
    " service"       0.68
    " złot"          0.68
    "牛奶"           0.67
    "glBindBuffer"   0.67
    "тері"           0.66
    "eter"           0.64
    Activation Density: 0.112%

    No Known Activations