    Negative Logits
     worker          -0.79
     workers         -0.78
     gddr            -0.78
     itſelf          -0.77
    theless          -0.72
     hunters         -0.72
     pleaſure        -0.70
     photolibrary    -0.69
     hikers          -0.68
     Rollo           -0.68
    Positive Logits
                     0.63
    ArrowToggle      0.58
    DebuggerNonUser  0.56
     beginnetje      0.53
    addCriterion     0.51
     autorytatywna   0.51
    fortawesome      0.48
    out              0.47
    endo             0.47
     table           0.46
    Activation Density: 0.132%

    No Known Activations