INDEX
    Explanations
    Negative Logits
     עושים  -0.08
    𬇕  -0.07
    BindView  -0.07
    pow  -0.07
    𬀩  -0.07
     blankets  -0.07
     railways  -0.07
     sofas  -0.06
    culate  -0.06
    (sf  -0.06
    Positive Logits
    手游  0.08
    조사  0.08
    CollectionView  0.07
    少数民族  0.07
     presentation  0.07
     Advent  0.07
     Composer  0.06
     transmitter  0.06
    を得  0.06
     OT  0.06
    Activation Density: 0.004%

    No Known Activations