    Negative Logits
         помочь  0.49
         помога  0.44
         realmente  0.43
         really  0.43
         khiến  0.43
         truly  0.41
         actively  0.40
         réellement  0.40
         aiutare  0.40
         genuinely  0.40
    Positive Logits
        不但  1.12
        不仅  1.02
        不僅  0.88
        一方面  0.79
         nejen  0.75
         একদিকে  0.66
        0.66
        一是  0.58
         både  0.57
         zowel  0.57
    Act Density 0.048%

    No Known Activations