INDEX
    Explanations

    languages, roots, and vocabulary

    Negative Logits
    " director"      0.69
    " Director"      0.68
    " standpoint"    0.66
    "urgical"        0.64
    " ass"           0.63
    "стных"          0.63
    " exhaust"       0.63
    "ract"           0.62
    "вшейся"         0.62
    " ab"            0.62
    Positive Logits
    " Balliye"    1.08
    "osphère"     0.94
    "kunde"       0.93
                  0.92
    "ונים"        0.92
    "cienza"      0.90
    "ון"          0.89
    "alibaba"     0.89
    "kový"        0.87
    "čí"          0.87
    Act Density 0.098%

    No Known Activations