    Negative Logits
    Token                 Logit
    " immediate"          -2.59
    " immediately"        -2.50
    "immediately"         -2.33
    " immédi"             -2.30
    "immediate"           -2.30
    " Immediate"          -2.25
    " immédiatement"      -2.20
    "Immediate"           -2.19
    " Immediately"        -2.16
    " instantly"          -2.08
    Positive Logits
    Token                 Logit
    ","                   0.63
    "."                   0.50
    " ("                  0.48
    "  "                  0.48
    (not shown)           0.47
    " H"                  0.47
    "帖最后由"            0.46
    " T"                  0.45
    ":"                   0.44
    "!"                   0.43
    Activation Density: 0.016%

    No Known Activations