INDEX
    Explanations

    talking, chatting, or working

    Negative Logits
    Perf         0.42
    Aging        0.37
    éléments     0.37
    Performance  0.36
    Rec          0.36
    attempts     0.35
     कहने        0.35
    Bel          0.34
    ající        0.34
    śni          0.34
    Positive Logits
     furiously    0.69
     diligently   0.65
     openly       0.64
     out          0.63
     directly     0.61
     intently     0.61
     tirelessly   0.60
     quietly      0.59
     frantically  0.59
     freely       0.59
    Act Density 0.473%

    No Known Activations