INDEX
    Explanations
    Negative Logits
     AssemblyCompany    -0.96
    OGND                -0.90
    principalColumn     -0.90
    ftagPool            -0.88
     CreateTagHelper    -0.88
    AddTagHelper        -0.87
     nahilalakip        -0.87
    MessageTagHelper    -0.83
     محفوظة             -0.82
    findpost            -0.82
    Positive Logits
    ;       0.67
    .*;     0.44
    .;      0.43
    );      0.38
    .       0.36
     ;      0.35
    ,       0.35
    $;      0.33
    *;      0.33
    };      0.32
    Act Density 0.001%

    No Known Activations