    Explanations

    phrases related to causing harm or negative consequences

    Negative Logits
    Token                Logit
    " Tatsache"          -0.78
    "קישורים"            -0.76
    "paddingVertical"    -0.72
    " للمعارف"           -0.67
    "Adri"               -0.65
    " strå"              -0.64
    " Haller"            -0.63
    "ppins"              -0.63
    " mellitus"          -0.63
    " Realität"          -0.62
    Positive Logits
    Token                Logit
    " cause"             1.39
    " CAUSE"             1.37
    " Caus"              1.36
    " Causes"            1.35
    " Cause"             1.35
    "causes"             1.32
    " causes"            1.30
    " caused"            1.28
    "caused"             1.26
    "cause"              1.25
    Activation Density: 0.093%

    No Known Activations