INDEX
    Explanations

    adding or additional content

    NEGATIVE LOGITS
     prevailed  0.80
     unanimously  0.80
    жан  0.80
     პირ  0.78
     overwhelmingly  0.78
    انت  0.78
     relentlessly  0.77
     تنت  0.76
     ero  0.75
     bombarded  0.75
    POSITIVE LOGITS
     tambahan  2.84
     additional  2.75
    additional  2.71
     zusätzlichen  2.64
     дополнительные  2.59
     zusätz  2.58
     supplémentaire  2.57
    额外的  2.56
     zusätzliche  2.56
     adicional  2.55
    Act Density 1.615%

    No Known Activations