INDEX
    Explanations
    NEGATIVE LOGITS
    Token              Logit
    " certamente"      0.53
    " conformément"    0.52
    " potrebbero"      0.50
    " enquanto"        0.50
    "ڱ"                0.49
    " asimismo"        0.49
    " যাওয়ার"          0.48
                       0.48
    " mentre"          0.48
    "ضي"               0.47
    POSITIVE LOGITS
    Token              Logit
    " the"             0.50
    "("                0.49
    " of"              0.48
    "↵↵"               0.46
    " important"       0.45
    " an"              0.44
    " HTML"            0.44
    "Play"             0.43
    " Model"           0.43
    "Omn"              0.43
    Activation Density: 0.107%

    No Known Activations