INDEX
    Explanations

    sentences summarizing conclusions or key points

    NEGATIVE LOGITS
    Token             Logit
    "ヘ"              -0.84
    "roups"           -0.74
    "ipel"            -0.72
    " サーティワン"   -0.70
    "DOM"             -0.68
    "TIT"             -0.68
    "chlor"           -0.66
    "legged"          -0.66
    "leted"           -0.64
    "itialized"       -0.63
    POSITIVE LOGITS
    Token             Logit
    " takeaway"       0.98
    " boils"          0.86
    " message"        0.76
    " verdict"        0.75
    "nings"           0.73
    " perspective"    0.72
    ":"               0.72
    " approach"       0.69
    " nutshell"       0.69
    " lesson"         0.68
    Activation Density: 0.056%

    No Known Activations