    Negative Logits
    Token               Value
    " Probably"         1.35
    " probably"         1.33
    "probably"          1.28
    "Probably"          1.27
    " akan"             1.26
    " sẽ"               1.14
    " probabilmente"    1.06
    " probablemente"    1.06
    "will"              1.06
    "Will"              1.04
    Positive Logits
    Token               Value
    " were"             2.01
    " ever"             1.79
    " weren"            1.78
    "were"              1.76
    " WERE"             1.70
    " Were"             1.62
    "Were"              1.58
    " somehow"          1.56
    " happens"          1.51
    "ever"              1.47
    Activation Density: 0.244%

    No Known Activations