INDEX
    Explanations
    No Explanations Found
    NEGATIVE LOGITS
    "Things"      0.52
    "Any"         0.52
    "Anything"    0.52
    "Others"      0.52
    "任何人"      0.50
    ")-"          0.50
    "Anyone"      0.49
    " Others"     0.49
    "things"      0.48
    "غير"         0.47
    POSITIVE LOGITS
    " a"          0.98
    " three"      0.94
    " an"         0.94
    " two"        0.88
    " four"       0.84
    " several"    0.81
    " five"       0.77
    "了一个"      0.76
    " μια"        0.75
    " ένα"        0.74
    Activation Density 4.788%

    No Known Activations