INDEX
    Explanations

    words ending in -ive, -ation, -able, -ful

    Negative Logits
    之間  0.42
    之间  0.40
     caminho  0.40
    麻烦  0.40
    ‌ی  0.39
     नाइट्रेट  0.39
     пре  0.39
     थकान  0.38
     πολύ  0.38
    ército  0.38
    Positive Logits
    ative  0.99
    ized  0.94
    ation  0.89
    ed  0.87
    able  0.85
    istic  0.84
    ified  0.82
    ment  0.81
    ful  0.80
    ity  0.80
    Activation Density: 0.838%

    No Known Activations