    Explanations

    party, whole, assignment, Fed

    Negative Logits
    an                  1.03
    [token not shown]   0.94
    se                  0.93
    К                   0.91
     I                  0.89
    Мо                  0.87
    ed                  0.87
    entric              0.84
    z                   0.83
    [token not shown]   0.82
    Positive Logits
     tassels            0.85
    тились              0.84
     quilts             0.84
     cucumbers          0.83
     chestnuts          0.83
     screws             0.83
     giraffe            0.82
     twigs              0.80
     acorns             0.79
    ("^                 0.79
    Act Density 0.001%

    No Known Activations