    Negative Logits
    ſelf          -0.61
     Cæsar        -0.60
    ſelves        -0.58
     Diony        -0.57
     snippetHide  -0.56
     entanto      -0.54
     mandal       -0.52
     MDC          -0.52
     EFE          -0.52
     כך           -0.52
    Positive Logits
     it           1.17
     there        1.02
     we           1.01
     they         0.93
     I            0.85
     naturally    0.78
     no           0.78
     maybe        0.74
     why          0.72
     perhaps      0.72
    Activation Density: 0.071%

    No Known Activations