    Explanations

    questions asking how something works or how to accomplish a task

    Negative Logits
    Token         Logit
    "cu"          -0.56
    (missing)     -0.53
    " ["          -0.53
    "жен"         -0.52
    "ci"          -0.49
    "te"          -0.49
    "["           -0.48
    "!"           -0.47
    "bari"        -0.47
    " fournir"    -0.46
    Positive Logits
    Token         Logit
    " how"        1.30
    " itſelf"     1.28
    " myſelf"     1.17
    " कैसे"        1.17
    " Nasıl"      1.17
    " Hvordan"    1.15
    " איך"        1.14
    "Hvordan"     1.13
    " hvordan"    1.12
    " چگونه"      1.10
    Activation Density: 0.326%

    No Known Activations