INDEX
    Explanations

    instances of the word "know" and its variants, in contexts that signal conversational awareness and self-reference

    NEGATIVE LOGITS
    oses
    -0.18
    oola
    -0.17
    ẫ
    -0.16
    egra
    -0.15
     gratuites
    -0.15
    field
    -0.14
    asje
    -0.14
    zÅij
    -0.14
    coe
    -0.14
    gett
    -0.14
    POSITIVE LOGITS
    éĤ£ç§į
    0.15
     sometimes
    0.15
    ometimes
    0.14
    ÙĪØ§Ø±
    0.14
    üc
    0.14
    976
    0.14
    oming
    0.14
    ãģĵãģĿ
    0.14
    IED
    0.14
    ffset
    0.13
    Act Density 0.023%

    No Known Activations