INDEX
    Explanations
    Negative Logits
        Token            Logit
        " 지정"           0.94
        "anglement"      0.92
        " nagu"          0.90
        " connecter"     0.86
        "Specified"      0.84
        " tarafından"    0.84
        "なあ"            0.84
        "أ"              0.83
        " gọi"           0.81
        " especificar"   0.81
    Positive Logits
        Token            Logit
        " comparison"    1.79
        " Comparison"    1.60
        " apples"        1.60
        " compar"        1.55
        " porówn"        1.54
        " comparisons"   1.52
        "Comparison"     1.50
        " Comparisons"   1.50
        " against"       1.48
        " favorably"     1.46
    Activation Density: 0.148%

    No Known Activations