INDEX
    Explanations

    negating conditions or qualities

    NEGATIVE LOGITS
    "失败"  0.40
    " ikke"  0.39
    " exploration"  0.38
    " Few"  0.38
    " मार्गदर्शन"  0.37
    " stanza"  0.37
    " não"  0.37
    " Não"  0.36
    "ที่ไม่"  0.36
    " иной"  0.36
    POSITIVE LOGITS
    " necessarily"  0.58
    " overly"  0.56
    " inherently"  0.55
    " allowed"  0.54
    "hin"  0.54
    " currently"  0.53
    "necessarily"  0.51
    " obstante"  0.50
    "orious"  0.50
    " assolutamente"  0.49
    Activation Density: 0.140%

    No Known Activations