INDEX
    Explanations
    Negative Logits
    Token          Value
    "对待"         0.45
    " augustus"    0.45
    "ainak"        0.45
    "Tabpage"      0.44
    "ivé"          0.44
    " μεγαλ"       0.44
    " تھے"         0.43
    " zostało"     0.43
    " estavam"     0.43
    " منظور"       0.42
    Positive Logits
    Token              Value
    "akrishnan"        0.47
    "なぜ"             0.46
    " VPN"             0.43
    "ញ្"               0.42
    "تا"               0.41
    " bumpy"           0.41
    [no token text]    0.40
    " bosons"          0.39
    [no token text]    0.39
    [no token text]    0.39
    Act Density 0.003%

    No Known Activations