    Explanations

    theory and practical knowledge

    NEGATIVE LOGITS
    Token             Value
    " sciatica"       0.75
    (blank)           0.75
    "теп"             0.73
    "ных"             0.73
    "тации"           0.71
    "ва"              0.70
    " качество"       0.69
    " впечатление"    0.67
    " dục"            0.67
    " влияние"        0.66
    POSITIVE LOGITS
    Token             Value
    "Element"         0.96
    "et"              0.95
    "2"               0.93
    "Theory"          0.91
    "us"              0.89
    "א"               0.88
    " Theoretical"    0.87
    "తో"              0.86
    "y"               0.85
    "w"               0.85
    Activation Density: 0.018%

    No Known Activations