    Explanations

    concepts related to conditional phrases and causal relationships

    Negative Logits
        Token             Logit
        " myſelf"         -0.74
        " ſeveral"        -0.72
        "ſelves"          -0.69
        "éroport"         -0.67
        " himſelf"        -0.67
        " Theſe"          -0.66
        "NewLabel"        -0.66
        " itſelf"         -0.65
        " الدولى"         -0.65
        "DoubleQuotes"    -0.64
    Positive Logits
        Token             Logit
        " because"        1.04
        "because"         0.96
        " Sebab"          0.92
        " Because"        0.89
        "Because"         0.87
        " perché"         0.85
        " Ведь"           0.84
        "畢竟"            0.83
        " BECAUSE"        0.83
        " perchè"         0.82
    Activation Density: 0.303%
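
    The density figure is usually read as the fraction of sampled token
    positions on which the feature fires. The sketch below illustrates that
    reading only; the function name, the zero threshold, and the toy data are
    all assumptions and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature "fires" on a token when its activation is strictly
    greater than `threshold` (0.0 here); the dashboard may use another rule.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of which activate the feature.
toy_activations = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(toy_activations)

# Format as a percentage, in the same style as the dashboard figure.
print(f"Activation density: {density:.3%}")
```

    Under this reading, a density of 0.303% would mean the feature is active
    on roughly 3 out of every 1,000 sampled tokens.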

    No Known Activations