INDEX
    Explanations
    NEGATIVE LOGITS
    															
    -1.16
    老头
    -1.14
    悩む
    -1.12
     amables
    -1.09
    																			
    -1.07
     şimdi
    -1.05
     ermöglich
    -1.04
    														
    -1.02
    											
    -1.00
    uzes
    -0.99
    POSITIVE LOGITS
     from
    2.80
     pulang
    1.59
    Returned
    1.50
     Returned
    1.49
     return
    1.46
     returned
    1.44
     Returning
    1.40
     after
    1.40
     returning
    1.35
    回来
    1.34
    Activation Density 0.020%

    No Known Activations