INDEX
    Explanations
    NEGATIVE LOGITS
     उबाल  0.45
    rels  0.44
    (no visible token)  0.43
    imente  0.43
    料理  0.42
    ר  0.42
    versive  0.41
     कप्तान  0.41
    ussels  0.41
    (no visible token)  0.41
    POSITIVE LOGITS
     Werte  0.57
     અવ  0.49
     നേര  0.46
    (no visible token)  0.45
     ተግባ  0.45
     authorizing  0.45
     percent  0.43
     كانت  0.42
     grieve  0.42
     tarefas  0.42
    Act Density 0.006%

    No Known Activations