    Explanations

    pronouns and relative clauses

    Negative Logits
    addContainerGap  -0.54
    udan             -0.52
    arische          -0.51
    ocyclic          -0.48
     kwal            -0.48
     peoples         -0.46
     hoods           -0.45
    OGND             -0.45
    یریت             -0.45
    PDO              -0.44
    Positive Logits
    ########.        1.03
    rrggbb           0.79
    enumi            0.76
    ]")]             0.72
     autorytatywna   0.68
    tdessen          0.68
    ++]              0.67
    woordig          0.67
     kaynağından     0.67
     '\\;'           0.66
    Activation Density: 0.186%

    No Known Activations