    Explanations

    science/physics

    Negative Logits
    " classical"     -1.43
    "classical"      -1.38
    "Classical"      -1.23
    " Classical"     -1.21
    "classic"        -1.12
    "ſelves"         -1.04
    " pleaſure"      -1.01
    " classic"       -0.95
    " Shakspeare"    -0.95
    " classiques"    -0.95
    Positive Logits
    "s"      0.82
    "."      0.60
    ","      0.57
    "e"      0.55
    " Si"    0.51
    " r"     0.49
    "san"    0.49
    "ся"     0.48
    "tan"    0.48
    "Si"     0.48
    Activation Density: 0.203%

    No Known Activations