INDEX
    Explanations
    Negative Logits
    Token         Logit
    "↵↵"          0.75
    " of"         0.66
    " McFadden"   0.64
    "8"           0.60
    " Billy"      0.60
    "ies"         0.59
    "v"           0.59
    " Willie"     0.56
    "EY"          0.56
    " Joey"       0.55
    Positive Logits
    Token         Logit
    (missing)     0.82
    " in"         0.80
    "ی"           0.74
    (missing)     0.72
    (missing)     0.71
    "ы"           0.71
    (missing)     0.70
    "もちろん"      0.70
    "あるいは"      0.68
    "又は"         0.68
    Activation Density: 0.001%

    No Known Activations