INDEX
    Explanations

    Attends to general concepts or categories from the specific instances related to them in a sentence.

    Head Attr Weights
    0:0.06
    1:0.06
    2:0.06
    3:0.10
    4:0.06
    5:0.02
    6:0.38
    7:0.23
    Negative Logits
    " pregunto"          -0.28
    " думаете"           -0.28
    "epresidente"        -0.27
    "+:+"                -0.26
    "ordum"              -0.26
    "deelte"             -0.25
    "pholes"             -0.25
    "PointerException"   -0.25
    "IonicModule"        -0.25
    "pernicus"           -0.24
    Positive Logits
    " Anſ"               0.32
    "ſelf"               0.32
    " Monfieur"          0.32
    "mste"               0.31
    " either"            0.31
    "ſelves"             0.31
    "astify"             0.31
    " Cæsar"             0.31
    "idak"               0.30
    "either"             0.30
    Act Density 0.854%

    No Known Activations