INDEX
    Negative Logits
    " 거고"    0.42
    (no token shown)    0.40
    " dalamnya"    0.38
    (no token shown)    0.38
    "ใช"    0.38
    "اعه"    0.37
    "ÔNG"    0.37
    " 여기서"    0.37
    " thereon"    0.36
    "بانی"    0.36

    Positive Logits
    "($"    0.41
    " Carter"    0.41
    " непосредственно"    0.41
    (no token shown)    0.41
    "并没有"    0.39
    " Keller"    0.38
    "ގެ"    0.38
    "(@"    0.37
    "Nation"    0.37
    " directly"    0.37

    Activation Density: 0.118%

    No Known Activations