    Explanations

    negatively affects or opposes

    Negative Logits
    " Okay"            0.40
    " Better"          0.39
    " Absence"         0.38
    " Add"             0.35
    " okay"            0.34
    " absence"         0.34
    " X"               0.33
    " Te"              0.33
    " Partial"         0.33
    " जुड़"             0.33

    Positive Logits
    " undermines"      0.50
    "正常的"            0.49
    " undermine"       0.46
    " unnecessarily"   0.46
    " sanctity"        0.46
    " needlessly"      0.46
    " undermining"     0.43
    "原本"              0.43
    " précieux"        0.42
    " innocence"       0.42

    Activation Density 0.268%

    No Known Activations