    NEGATIVE LOGITS
    "Н"             0.50
    "Ν"             0.44
    "В"             0.42
    "Про"           0.41
    "𝓻"             0.40
    "Ба"            0.40
    "П"             0.39
    "О"             0.39
    "Но"            0.37
                    0.37
    POSITIVE LOGITS
    " again"        0.57
    " opět"         0.53
    " novamente"    0.48
    " similarly"    0.48
    " Again"        0.48
    " likewise"     0.48
    "同樣"           0.47
    " unlike"       0.47
    "again"         0.46
    "찬가지"         0.45
    Activation Density: 0.312%

    No Known Activations