INDEX
    Explanations

    impressive verbs ending in ive

    Negative Logits
    Token             Logit
    [not rendered]    1.70
    "रा"              1.59
    "1"               1.57
    [not rendered]    1.57
    "("               1.46
    "ため"            1.45
    "ها"              1.39
    [not rendered]    1.39
    "ы"               1.39
    "ς"               1.36
    Positive Logits
    Token             Logit
    "el"              1.52
    [not rendered]    0.98
    " legy"           0.98
    " všet"           0.96
    [not rendered]    0.96
    " desenvolv"      0.95
    " ofthe"          0.94
    " ويت"            0.94
    "a"               0.94
    [not rendered]    0.94
    Act Density 0.025%

    No Known Activations