    Negative Logits
    Token             Value
    " proposition"    1.12
    "proposition"     1.05
    " hardcore"       1.01
    "শক্ত"             1.01
    " héros"          1.00
    " жест"           0.98
    "শক্তি"            0.95
    " religiously"    0.94
    " stereotypes"    0.94
    "activated"       0.94
    Positive Logits
    Token               Value
    " knowledge"        1.34
    " знания"           1.22
    "knowledge"         1.13
    " advise"           1.11
    "知識"              1.10
    " advice"           1.09
    " pengetahuan"      1.08
    " disseminating"    1.07
    " advising"         1.07
    "artment"           1.07
    Activation Density: 0.067%

    No Known Activations