INDEX
    Explanations
    Negative Logits
    Token               Value
    " impossibility"    0.62
    "cannot"            0.59
    "params"            0.58
    " imposible"        0.58
    " невозможно"       0.56
    " impossible"       0.56
    " impossibile"      0.55
    " cannot"           0.54
    " нельзя"           0.54
    "不可能"            0.54
    Positive Logits
    Token               Value
    " But"              0.47
    "9"                 0.46
    " lire"             0.45
    " keep"             0.44
    " dành"             0.44
    " lur"              0.42
    " Omb"              0.41
    "传统的"            0.41
    "ü"                 0.40
    " fontos"           0.40
    Act Density 0.080%

    No Known Activations