    Negative Logits
    Token            Logit
    (not shown)      -0.09
    " LANGUAGE"      -0.08
    "bereiche"       -0.08
    " imposto"       -0.08
    " imitation"     -0.08
    "IMIT"           -0.08
    "acteur"         -0.08
    " alcançar"      -0.07
    " consp"         -0.07
    " peculiar"      -0.07
    Positive Logits
    Token            Logit
    " बाइक"          0.08
    " велосипед"     0.08
    "bike"           0.08
    " सु"            0.08
    " المحمول"       0.08
    "710"            0.08
    " Barcelona"     0.08
    "lip"            0.08
    (not shown)      0.08
    " motorcycles"   0.08
    Activation Density: 0.012%

    No Known Activations