INDEX
    Explanations

    causal relationships and explanations, especially those starting with "because."

    Negative Logits
    "MLLoader"        -0.73
    "matchCondition"  -0.65
    "invokeLater"     -0.64
    " femininos"      -0.61
    " houſe"          -0.60
    " Verſ"           -0.60
    " Houſe"          -0.58
    " sentenza"       -0.57
    "abestanden"      -0.56
    " Monfieur"       -0.56
    Positive Logits
    " because"  0.84
    "because"   0.68
    " weil"     0.68
    "Porque"    0.67
    " porque"   0.64
    " omdat"    0.63
    " لأنه"     0.61
    " karena"   0.60
    " Porque"   0.59
    "Because"   0.59
    Activation Density: 0.829%

    No Known Activations