INDEX
    Explanations

    negations and limitations

    Negative Logits
    " VERY"          0.72
    "不仅"            0.58
    " очень"         0.55
    "不僅"            0.54
    " nejen"         0.54
    " неболь"        0.53
    " both"          0.52
    " very"          0.49
    " небольшой"     0.49
    " sehr"          0.49
    Positive Logits
    " quelconque"    1.16
    " alcuna"        0.96
    " ningún"        0.94
    " quelcon"       0.92
    " alcun"         0.91
    " żad"           0.91
    " tampoco"       0.88
    " کوئی"          0.86
    " irgende"       0.86
    " nor"           0.85
    Act Density 0.110%

    No Known Activations