INDEX
    Explanations
    NEGATIVE LOGITS
    "от"  0.89
    "но"  0.88
    "ກັບ"  0.88
    "라고"  0.88
    "más"  0.87
    " rangkaian"  0.87
    " веке"  0.86
    " Yale"  0.85
    "나는"  0.85
    " Phật"  0.84
    POSITIVE LOGITS
    " runner"  1.18
    "🏃"  1.03
    "runners"  1.00
    "runner"  0.95
    " homers"  0.95
    "geq"  0.95
    " Runner"  0.94
    "\t"  0.93
    " runners"  0.88
    " nupt"  0.88
    Activation Density: 0.014%

    No Known Activations