    NEGATIVE LOGITS
    Token          Logit
    "建議"         0.44
    " suggested"   0.40
    " الخامسه"     0.38
    " Pes"         0.37
    " suggesting"  0.36
    " Suggested"   0.36
    "推荐"         0.35
    " Toler"       0.35
    " Wesleyan"    0.35
    " suggest"     0.35
    POSITIVE LOGITS
    Token              Logit
    "_"                0.50
    " hallucinations"  0.39
    " antibiotics"     0.38
    " curtains"        0.38
    "،"                0.38
    "Bits"             0.38
    " axons"           0.38
    "pa"               0.37
    [token missing]    0.37
    "ANA"              0.37
    Activation Density: 0.463%

    No Known Activations