    Negative Logits
    Token           Logit
    "-arrow"        -0.07
    "LR"            -0.07
    " Closure"      -0.07
    "_sw"           -0.06
    " XII"          -0.06
    " Moses"        -0.06
    "rvé"           -0.06
    " axe"          -0.06
    "Closure"       -0.06
    "cors"          -0.06
    Positive Logits
    Token           Logit
    " notebook"     0.09
    " Notebook"     0.08
    " notebooks"    0.07
    ".Load"         0.07
                    0.07
    " paused"       0.07
    "entials"       0.06
    "iaz"           0.06
    "_book"         0.06
    " gravid"       0.06
    Activation Density: 0.004%

    No Known Activations