    Explanations

    forget quick, vague, loud

    Negative Logits
    Token               Value
    " VERY"             0.44
    ")["                0.39
    (no token text)     0.39
    " non"              0.38
    " both"             0.38
    "Non"               0.38
    " nontrivial"       0.36
    "Очень"             0.36
    " approximately"    0.36
    " very"             0.36
    Positive Logits
    Token               Value
    " tradicionales"    0.80
    "传统的"            0.79
    " tradicional"      0.79
    " conventional"     0.76
    " tradicion"        0.73
    "傳統"              0.71
    " flashy"           0.70
    " tradizionale"     0.70
    " traditionnel"     0.70
    "従来の"            0.70
    Act Density 0.120%

    No Known Activations