INDEX
    Explanations

    percentages and numerical data

    Negative Logits
    "engagent"  -0.50
    " صوتيه"  -0.50
    "Diweddarwch"  -0.48
    "Viited"  -0.47
    ""  -0.46
    " zijne"  -0.45
    " zprá"  -0.42
    " للمعارف"  -0.42
    " trenes"  -0.41
    " novità"  -0.41
    Positive Logits
    "half"  0.76
    "一半"  0.71
    " half"  0.71
    " Half"  0.68
    "Half"  0.66
    " halves"  0.62
    "半分"  0.61
    " mitad"  0.60
    " metade"  0.59
    " Hälfte"  0.59
    Act Density 0.075%

    No Known Activations