INDEX
    Explanations

    elements of high magnitude or significance in the text

    Negative Logits
     stagger  -0.52
    aner  -0.52
     lipat  -0.50
    にも  -0.50
    subplots  -0.50
     mực  -0.49
    Carriera  -0.49
    clothing  -0.48
    dafx  -0.48
     بلکه  -0.48
    Positive Logits
    Cyfeiriadau  0.75
    ंदीखरीदारी  0.69
    ))));  0.67
     للمعارف  0.66
     AttributeSet  0.65
    );*/  0.64
    ")));  0.64
    ,:);  0.63
    ):=  0.63
    AutoScaleMode  0.62
    Activation Density: 0.069%

    No Known Activations