INDEX
    Explanations

    instructions or commands

    Negative Logits
    Token              Logit
    "capital"          0.44
    "Capital"          0.42
    "городе"           0.41
    "Church"           0.39
    "πτυ"              0.38
    "qualification"    0.38
    " punishment"      0.38
    "significant"      0.38
    "გუფი"             0.38
    " circulated"      0.37
    Positive Logits
    Token              Logit
    " Env"             0.45
    " budgets"         0.44
    [token missing]    0.43
    "estries"          0.40
    " env"             0.39
    " Bud"             0.39
    " "                0.38
    " contexts"        0.38
    [token missing]    0.38
    " Buds"            0.38
    Activation Density: 0.025%

    No Known Activations