INDEX
    Explanations
    Negative Logits
    Примечания        -0.28
    (token missing)   -0.24
     ban              -0.24
     f                -0.24
     tas              -0.23
     fit              -0.23
    を与える          -0.23
    (token missing)   -0.23
     fournies         -0.22
    (token missing)   -0.22
    Positive Logits
     queſta           0.98
    ſicht             0.96
    ロウィン          0.96
    <unused79>        0.94
    <unused41>        0.94
    <unused28>        0.94
    <unused14>        0.94
    <unused23>        0.94
    <unused8>         0.94
    [@BOS@]           0.94
    Act Density 0.016%

    No Known Activations