INDEX
    Explanations

    entities knowing or wanting things

    Negative Logits
     তিনি  0.35
     তিনিও  0.29
     ergibt  0.29
     Means  0.28
     Brings  0.28
     beliau  0.27
    会导致  0.27
     означает  0.27
    導致  0.27
    তিনি  0.27
    Positive Logits
     itself  0.47
     knows  0.37
     wants  0.35
     believes  0.33
     자체  0.33
     recognizes  0.32
     understands  0.32
     knew  0.32
     considers  0.31
     thinks  0.31
    Activation Density: 0.139%

    No Known Activations