    Negative Logits
    Token          Logit
    (blank)        -0.06
    "Predicate"    -0.06
    " Cher"        -0.06
    " ration"      -0.06
    " Taj"         -0.06
    " Kol"         -0.06
    "\trec"        -0.06
    " pub"         -0.06
    "rodu"         -0.06
    "KW"           -0.06
    Positive Logits
    Token            Logit
    " withholding"   0.08
    " swift"         0.07
    " Swift"         0.07
    "haft"           0.07
    "وك"             0.07
    (blank)          0.07
    " خواب"          0.06
    "ption"          0.06
    (blank)          0.06
    " swiftly"       0.06
    Activation Density: 0.002%

    No Known Activations