    NEGATIVE LOGITS
    Token            Logit
    " Baldwin"       -0.06
    "party"          -0.06
    " hải"           -0.06
    " đĩa"           -0.06
    "isted"          -0.06
    "-destruct"      -0.06
    " sistemas"      -0.06
    "‌آ"              -0.06
    "άνα"            -0.06
    "ien"            -0.06
    POSITIVE LOGITS
    Token            Logit
    " urging"        0.11
    " urge"          0.10
    " urgent"        0.10
    " Urg"           0.08
    " urges"         0.08
    " imperative"    0.08
    " urged"         0.07
    "utility"        0.07
    ";;↵↵"           0.07
    ".Commands"      0.07
    Activation Density: 0.004%

    No Known Activations