INDEX
    NEGATIVE LOGITS
    uction          0.88
    ukiyoe          0.86
     heightened     0.83
    toctree         0.82
    ד               0.82
     houd           0.81
    סה              0.81
    ienced          0.80
    schaft          0.80
    ここに          0.79
    POSITIVE LOGITS
     ​​             1.09
                    1.07
                    1.01
    V               0.99
     vera           0.97
     Infantil       0.97
    urp             0.96
     Libro          0.96
    б               0.96
    Vi              0.93
    Activation Density: 0.001%

    No Known Activations