    Explanations

    adverbs that indicate effective or proper performance in actions

    Negative Logits
    まった            -0.59
     coper          -0.55
     substitution   -0.54
    Substitution    -0.54
     inspira        -0.52
     gekomen        -0.52
     animés         -0.51
     perbaikan      -0.51
     grö            -0.51
     as             -0.51

    Positive Logits
    )";↵             0.85
    correctly        0.83
    cibly            0.83
    denly            0.82
    ]));↵            0.81
    edly             0.80
     safely          0.79
    ]),↵             0.79
    ligently         0.78
    oughly           0.77

    Act Density 0.381%

    No Known Activations
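
    The Act Density figure above is usually read as the fraction of token positions on which the feature fires. The sketch below shows that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and do not come from this page, and the dashboard's actual computation may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 here).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of them with nonzero activation.
sample = [0.0] * 996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under this reading, a density of 0.381% would mean the feature is active on roughly 4 of every 1000 tokens in whatever corpus was sampled.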