    Negative Logits
    .mul            -0.07
    tgt             -0.06
                    -0.06
    ukkit           -0.06
    ;top            -0.06
     моя            -0.06
    _ins            -0.06
     ROI            -0.06
    -Regular        -0.06
    /fa             -0.06
    Positive Logits
    	properties      0.08
    _estimator      0.07
    thinking        0.07
    stalk           0.06
    Deleting        0.06
    verified        0.06
     clarification  0.06
     arguments      0.06
     Edition        0.06
    .classes        0.06
    Activation Density: 0.004%

    No Known Activations