INDEX
    Explanations

    assertions of understanding, comprehension, and explanations of concepts

    Negative Logits
    Token            Logit
    "jupiter"        -0.64
    "isholm"         -0.61
    "userManager"    -0.59
    " perdana"       -0.56
    "datagrid"       -0.56
    "Notable"        -0.55
    "gestone"        -0.55
    "Aholisi"        -0.52
    "iertamente"     -0.51
    "ittarius"       -0.51

    Positive Logits
    Token            Logit
    " why"           1.19
    " clearly"       0.92
    " how"           0.91
    "clearly"        0.84
    " fully"         0.83
    " concepts"      0.81
    " Clearly"       0.81
    "why"            0.79
    "Clearly"        0.79
    "ably"           0.77

    Activation Density: 0.156%

    No Known Activations