    NEGATIVE LOGITS
     un            0.48
     Hepatitis     0.47
     Festival      0.46
     A             0.46
     Maintains     0.44
     Fisherman     0.43
     Boats         0.43
     first         0.43
     chronically   0.43
    ↵↵             0.42
    POSITIVE LOGITS
    AND                0.51
    I                  0.49
    (no token shown)   0.47
    and                0.46
    m                  0.45
    И                  0.45
    по                 0.45
    lj                 0.45
    ли                 0.44
    (no token shown)   0.44
    Act Density 0.026%

    No Known Activations