INDEX
    Explanations
    Negative Logits
     точно
    -0.07
    ическая
    -0.07
    pty
    -0.06
     numb
    -0.06
     probation
    -0.06
    iamond
    -0.06
    Template
    -0.06
    Submitting
    -0.06
    acky
    -0.06
     corners
    -0.06
    Positive Logits
    ------------↵
    0.07
    _START
    0.07
     období
    0.06
     MVP
    0.06
     Able
    0.06
    ...↵↵↵↵↵↵
    0.06
    elder
    0.06
    rix
    0.06
    :]↵↵
    0.06
    .drive
    0.06
    Act Density 0.013%

    No Known Activations