    Negative Logits
    Token            Value
    "in"             0.72
    "است"            0.71
    (not shown)      0.65
    "س"              0.62
    (not shown)      0.62
    (not shown)      0.60
    (not shown)      0.56
    "М"              0.56
    "گ"              0.55
    (not shown)      0.55
    Positive Logits
    Token            Value
    " Brittany"      0.56
    "ství"           0.54
    "rabbit"         0.54
    " shavings"      0.53
    "↵↵"             0.52
    "ziff"           0.51
    " stage"         0.50
    "xlabel"         0.50
    " heart"         0.50
    " possíveis"     0.50
    Activation Density: 0.010%

    No Known Activations