INDEX
    Explanations

    adverbs describing careful or sophisticated action

    NEGATIVE LOGITS
    " ("      0.40
    " "       0.39
    "'"       0.38
    ","       0.38
    ":"       0.38
    " \"      0.37
    " ["      0.36
    "    "    0.35
    ";"       0.35
    "2"       0.34
    POSITIVE LOGITS
    " Многие"    0.37
    "ሁሉም"       0.36
                 0.35
    "یشہ"        0.35
    "ျေး"        0.35
    " 기반"      0.34
                 0.34
    " Всё"       0.33
    "avkhat"     0.33
    "iences"     0.32
    Act Density 0.036%

    No Known Activations