INDEX
    Explanations

    instances where "mind" or thoughts are mentioned in various contexts

    Negative Logits
    "irements"    -0.69
    "hered"       -0.65
    "ilon"        -0.62
    "imer"        -0.61
    "itud"        -0.60
    "elin"        -0.60
    "otta"        -0.59
    "mouth"       -0.59
    " Cod"        -0.59
    " rollout"    -0.59
    Positive Logits
    "anza"         0.66
    " è£ıè"        0.65
    " briefly"     0.64
    "ffer"         0.63
    " wondering"   0.62
    "ovie"         0.59
    "oldown"       0.58
    "scape"        0.58
    "chwitz"       0.58
    " when"        0.57
    Activation Density: 0.012%

    No Known Activations