INDEX
    Explanations

    reasons or explanations behind beliefs and actions

    Negative Logits
    Token              Logit
    "turned"           -0.50
    " turned"          -0.47
    " Efq"             -0.46
    "LikeLike"         -0.46
    "yelidikan"        -0.45
    "BorderFactory"    -0.44
    "illig"            -0.44
    " anzeigen"        -0.43
    " nij"             -0.43
    "でしたか"          -0.41
    Positive Logits
    Token              Logit
    " why"              1.00
    " weshalb"          0.91
    "AndEndTag"         0.79
    " فريبيس"           0.78
    "makeConstraints"   0.77
    "why"               0.77
    "ViewFeatures"      0.76
    " Deshalb"          0.75
    " pourquoi"         0.74
    "__\":\n"           0.74
    Act Density 0.166%

    No Known Activations