INDEX
    Explanations
    No Explanations Found
    Negative Logits
    " unmittelbar"    0.77
    "fqsen"           0.74
    "<unused2029>"    0.73
    "<unused1887>"    0.72
    "<unused1194>"    0.71
    "<unused2140>"    0.71
    "<unused2113>"    0.71
    " unmittel"       0.71
    "<unused373>"     0.71
    "<unused184>"     0.70
    Positive Logits
    "↵↵"              0.78
    (blank)           0.69
    "."               0.66
    (blank)           0.56
    "Nested"          0.51
    (blank)           0.51
    "Constraints"     0.48
    "Dataset"         0.47
    "Check"           0.46
    "Contains"        0.46
    Act Density 6.100%

    No Known Activations