    Negative Logits
    Token        Logit
    they         -1.90
    (blank)      -1.51
    (blank)      -1.48
    there        -1.45
    hän          -1.44
    tendrás      -1.38
    (blank)      -1.36
    sinistro     -1.35
    (blank)      -1.34
    iſt          -1.32
    Positive Logits
    Token        Logit
    .”           1.69
    –            1.63
    той          1.47
    一個         1.46
    móds         1.45
    !”           1.45
    attività     1.44
    ):           1.41
    ;            1.41
    activités    1.40
    Act Density 0.028%

    No Known Activations