INDEX
    Explanations

    medicine and programming

    Negative Logits
     itſelf     -0.88
     myſelf     -0.85
    ^(@)        -0.81
     Monfieur   -0.78
    dafx        -0.77
     Jefus      -0.76
     ་་         -0.75
     Majefty    -0.75
    ſelves      -0.72
     pleaſure   -0.72
    Positive Logits
    ↵↵          0.58
    y           0.54
    <eos>       0.53
                0.51
    lose        0.49
    幸いです    0.48
    ↵↵↵↵        0.47
     {          0.46
    den         0.46
     atle       0.45
    Activation Density 0.160%

    No Known Activations