    NEGATIVE LOGITS
    Token            Logit
    "zné"            -1.17
                     -1.05
    "ٶ"              -1.00
    " even"          -0.95
    " would"         -0.93
    " चुनें"           -0.92
    "licante"        -0.91
    "estrutura"      -0.88
    " piernas"       -0.88
    "Én"             -0.86
    POSITIVE LOGITS
    Token            Logit
    " apart"         0.86
    "aside"          0.85
                     0.83
    "apart"          0.81
    "too"            0.81
    "among"          0.79
    "hlt"            0.79
    "rage"           0.79
    "particularly"   0.79
    "many"           0.78
    Activation Density: 0.086%

    No Known Activations