INDEX
    Explanations

    harassment, intensity

    Negative Logits
     harassing        -0.62
     harass           -0.61
     harassment       -0.52
     harassed         -0.50
    vania             -0.47
    izability         -0.47
    pread             -0.45
    Rows              -0.44
    fixing            -0.43
    fixes             -0.43
    Positive Logits
    AndEndTag         0.75
     CreateTagHelper  0.74
     JpaRepository    0.70
    WriteTagHelper    0.65
    setopt            0.64
    SourceChecksum    0.64
    EndContext        0.62
    apunov            0.62
     Roskov           0.61
     autorytatywna    0.59
    Activation Density 0.106%

    No Known Activations