INDEX
    Explanations
    Negative Logits
    Token             Logit
    " Q"              -1.09
    " q"              -0.66
    " K"              -0.66
    " V"              -0.62
    " the"            -0.60
    " fast"           -0.60
    (unrendered)      -0.60
    ","               -0.59
    " and"            -0.57
    " far"            -0.57

    Positive Logits
    Token             Logit
    " itſelf"          1.09
    " Jefus"           1.00
    " Shakspeare"      0.96
    " Cæsar"           0.95
    " nakalista"       0.95
    " photolibrary"    0.94
    " Mahomet"         0.94
    " myſelf"          0.92
    " Majefty"         0.90
    " doubtnut"        0.90
    Activation Density: 0.085%
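
The density figure above is presumably the fraction of token positions on which this feature fires at all. The page does not define it, so the sketch below rests on that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is constructed only so that it reproduces the 0.085% value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero activation.
    The source page does not spell out its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations: 20,000 token positions, 17 of which fire.
sample = [0.0] * 19_983 + [1.2] * 17
density = activation_density(sample)
print(f"{density:.3%}")
```

At a density this low the feature is silent on almost every token, which fits the page listing no known activations.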

    No Known Activations