    Explanations

    statements about the consequences or results of actions

    Negative Logits
     quindi    -0.15
     THEN      -0.14
     kaldı     -0.14
     billeder  -0.14
    usk        -0.13
     hence     -0.13
    iyim       -0.13
     então     -0.13
    окрема     -0.13
     لذا       -0.13
    Positive Logits
     although  0.35
     while     0.35
     when      0.35
     since     0.33
     during    0.31
     unlike    0.29
     despite   0.29
    when       0.28
     if        0.27
     whereas   0.27
    Act Density 0.935%

    No Known Activations