    Explanations

    qualifications and contrasts

    Negative Logits
    "Result"      0.21
    " Gün"        0.20
    " 이는"       0.20
    "Gün"         0.20
    " Ef"         0.19
    "ushroom"     0.19
    " Мо"         0.19
    "이는"        0.19
    " 这是"       0.19
    " G"          0.19

    Positive Logits
    "albeit"        0.26
    "但不"          0.25
    " albeit"       0.20
    "mainly"        0.20
    " चाहें"         0.19
    "특히"          0.18
    "尤其"          0.17
    " 특히"         0.17
    " mainly"       0.17
    " soprattutto"  0.17

    Activation Density: 0.894%

    No Known Activations