INDEX
    Explanations
    NEGATIVE LOGITS
    on          0.54
    evidence    0.45
    et          0.44
    BEL         0.44
    linear      0.44
    echo        0.44
    lbrace      0.42
    to          0.42
     назвал     0.42
    multif      0.42
    POSITIVE LOGITS
     solito     0.57
    ;           0.54
    ре          0.51
     selaku     0.50
    ،           0.48
     saja       0.48
     mesmos     0.47
     самих      0.47
                0.46
                0.46
    Activation Density: 0.065%

    No Known Activations