INDEX
    Explanations

    defamation, libel, slander, smearing

    Negative Logits
    Optim  0.49
     оптими  0.46
    optim  0.45
    索引  0.42
     entusi  0.41
     optim  0.41
     আধ  0.41
    0.40
     متجه  0.40
     Optim  0.40
    Positive Logits
     slander  1.76
     defamatory  1.58
     defamation  1.49
     smear  1.47
    1.44
     smears  1.38
     defam  1.33
     baseless  1.27
     libel  1.26
    1.23
    Act Density 0.050%

    No Known Activations