INDEX
    Explanations
    NEGATIVE LOGITS
    ൊന്നും
    0.74
     হ্
    0.73
     కాని
    0.67
    <unused2149>
    0.65
    <unused2130>
    0.65
    యత్
    0.62
    なくて
    0.59
    <unused2213>
    0.59
     =~
    0.59
     -*-
    0.58
    POSITIVE LOGITS
    ,
    3.62
    ،
    2.77
    !,
    2.35
    -,
    2.29
    2.27
    (),
    2.27
    ™,
    2.26
    ®,
    2.23
     ,
    2.21
    2.18
    Act Density 0.180%

    No Known Activations