INDEX
    Explanations
    Negative Logits
    "aded"    0.78
    "do"      0.77
    "Lu"      0.72
    "There"   0.71
    "কাপ"     0.69
    "्वा"      0.69
    "MCP"     0.68
    "is"      0.68
    "when"    0.68
    "from"    0.67
    Positive Logits
    " tmp"    1.21
    "<?>"     1.17
    " lhs"    1.04
    " _;"     1.00
    " p"      0.99
    " m"      0.98
    " mr"     0.97
    " plt"    0.96
    " js"     0.96
    " h"      0.93
    Act Density 0.150%

    No Known Activations