    Explanations

    phrases related to inability or difficulty in accomplishing tasks

    Negative Logits
    Token       Logit
    "nis"       -0.15
    "ASI"       -0.15
    " Know"     -0.14
    "etr"       -0.14
    "Know"      -0.14
    "eteria"    -0.14
    " know"     -0.14
    "curacy"    -0.13
    " knows"    -0.13
    "astes"     -0.13
    Positive Logits
    Token       Logit
    " find"     0.25
    " figure"   0.23
    "find"      0.22
    " finds"    0.22
    " stomach"  0.21
    " FIND"     0.20
    "æī¾åΰ"     0.19
    "figure"    0.18
    " Finds"    0.18
    ".find"     0.18
    Activation Density: 0.130%

    No Known Activations