INDEX
    Explanations

    entity followed by action

    Negative Logits
    " operate"    0.64
    " Are"        0.58
    "nare"        0.55
    " pollute"    0.53
    "Are"         0.52
    " gestire"    0.51
    " Lied"       0.51
    " Operate"    0.50
    "iams"        0.50
    " fale"       0.50
    Positive Logits
    " выяс"       0.50
    "लेश्वर"        0.49
    " prophyl"    0.49
    " объяс"      0.48
    "まず"        0.48
    "proceed"     0.47
    " addresses"  0.47
    "Analog"      0.44
    " proceeds"   0.44
    " recalls"    0.43
    Act Density 0.198%

    No Known Activations