    NEGATIVE LOGITS
    Token                  Logit
    "SharedDtor"           -1.32
    " Roskov"              -1.30
    "InjectAttribute"      -1.27
    " feroit"              -1.24
    " متعلقه"              -1.23
    " enfans"              -1.23
    " resourceCulture"     -1.19
    " quæ"                 -1.17
    "tagHelperRunner"      -1.16
    "DockStyle"            -1.15
    POSITIVE LOGITS
    Token                  Logit
                           0.71
                           0.69
                           0.65
    " ("                   0.64
    " l"                   0.62
    "↵↵"                   0.60
    "."                    0.59
    "ity"                  0.58
    " A"                   0.57
    " standard"            0.57
    Act Density 0.660%

    No Known Activations