INDEX
    Explanations

    Attends to tokens naming specific classes or categories, from tokens describing their attributes or functions.

    Head Attr Weights
    0:0.41
    1:0.26
    2:0.08
    3:0.04
    4:0.04
    5:0.03
    6:0.03
    7:0.06
    Negative Logits
     nakalista
    -0.46
     noDo
    -0.40
    DockStyle
    -0.39
    ANTLR
    -0.35
     المعيارى
    -0.34
    VIAF
    -0.34
    bewerken
    -0.33
     pousse
    -0.31
     resourceCulture
    -0.30
    resizingMask
    -0.30
    Positive Logits
    YON
    0.31
     ſmall
    0.30
    0.30
     InputDecoration
    0.28
    lacer
    0.28
     wind
    0.27
     OMITBAD
    0.27
    phyllum
    0.27
     Hamm
    0.27
    حاد
    0.27
    Act Density 0.870%

    No Known Activations