    Explanations

    phrases indicating relationships between cause and effect

    Negative Logits
    Token          Logit
    ":"            -0.55
    "."            -0.49
    ";"            -0.49
    "ocardium"     -0.49
    "hören"        -0.48
    "grao"         -0.47
    "::"           -0.46
    "!"            -0.45
    "magna"        -0.44
    "خرى"          -0.42
    Positive Logits
    Token                Logit
    "NUMX"               0.77
    " cherchés"          0.75
    " محفوظة"            0.75
    " समीक्षाओं"           0.74
    "BufferException"    0.73
    " nahilalakip"       0.73
    " oprot"             0.72
    " Мексичка"          0.71
    "GEBURTSDATUM"       0.71
    " defaultstate"      0.71
    Activation Density: 0.311%

    No Known Activations