INDEX
    Explanations
    Negative Logits
    Token             Value
    "-"               0.52
    "IMF"             0.44
    "стя"             0.41
    " totalitarian"   0.41
    " gusta"          0.40
    " завжди"         0.40
    " bebé"           0.40
                      0.40
    "็อก"             0.40
    "頑張"            0.39
    Positive Logits
    Token             Value
    " इतर"            0.46
    " 其他"           0.43
    " enzymes"        0.43
                      0.43
    " advisor"        0.42
    " Opens"          0.41
    " parmesan"       0.40
    " oars"           0.40
    " other"          0.40
                      0.39
    Activation Density: 0.013%

    No Known Activations