    NEGATIVE LOGITS
    就會              0.38
    roughly          0.38
    whatever         0.36
    GLOBALS          0.34
     irgendwo        0.34
     कहला            0.34
     مثلا            0.34
    SqlServer        0.33
     במהלך           0.33
     ciertos         0.32
    POSITIVE LOGITS
     added           0.67
     properly        0.66
     explicit        0.66
     explicitly      0.61
     Added           0.61
     included        0.59
     correctly       0.57
    ちゃんと          0.57
     correttamente   0.55
     правильно       0.54
    Activation Density 0.081%

    No Known Activations