    Negative Logits
    Value   Token
    1.13    "க்"
    1.12    "۰"
    1.11    "$"
    1.03    (not shown)
    0.99    "ד"
    0.98    " negligently"
    0.97    "،"
    0.94    "רי"
    0.91    "	"
    0.90    "פ"
    Positive Logits
    Value   Token
    1.64    "r"
    1.59    "g"
    1.41    "is"
    1.24    " Starts"
    1.23    "as"
    1.23    "to"
    1.23    "the"
    1.22    "ta"
    1.21    "ar"
    1.20    "in"
    Activation Density: 0.381%

    No Known Activations