INDEX
    Explanations

    spending time learning or exploring

    Negative Logits
     "!             0.47
     όταν           0.45
     عندما          0.43
     Lors           0.40
     حتى            0.40
     quando         0.39
     पहली           0.39
     когда          0.39
     cuando         0.38
     образа         0.38
    Positive Logits
     studying       0.60
     exploring      0.59
     investigating  0.53
     discussing     0.52
     analyzing      0.52
     researching    0.51
     preparing      0.49
     constructing   0.48
     revising       0.47
    im              0.46
    Activation Density 0.016%

    No Known Activations