INDEX
    Explanations
    Negative Logits
    ہ
    0.58
    활동
    0.49
    fotos
    0.48
    Leben
    0.46
     settembre
    0.45
    هِ
    0.44
    soort
    0.44
    Thời
    0.44
    Videos
    0.44
    Hoje
    0.44
    Positive Logits
     
    0.51
    プラン
    0.50
     encompassing
    0.47
     congratulate
    0.46
     discouraged
    0.46
     <
    0.45
     mimicking
    0.45
     b
    0.45
    indeterminate
    0.44
     लापरवाही
    0.43
    Act Density 0.001%

    No Known Activations