    Explanations

    universally agreed consensus

    NEGATIVE LOGITS
    Token               Logit
    " Activation"       0.74
    " activations"      0.74
    " gratuita"         0.72
    " gratuitamente"    0.71
    "Associ"            0.69
    " तपास"             0.69
    "दिनी"              0.69
    " activation"       0.69
    " اظہار"            0.69
    " Automatic"        0.68
    POSITIVE LOGITS
    Token               Logit
    " compromise"       2.79
    " compromises"      2.46
    " consensus"        2.35
    " Comprom"          2.24
    "comprom"           2.23
    " agreed"           2.07
    " agreement"        2.04
    " Consensus"        2.04
    "consensus"         2.03
    " agreeing"         2.00
    Act Density 0.257%

    No Known Activations