INDEX
    Explanations

    works well or resonates most

    NEGATIVE LOGITS
    m          0.48
    bon        0.47
    '          0.47
    ne         0.46
    change     0.46
    h          0.46
    0          0.46
    (blank)    0.44
    to         0.43
    PORT       0.43
    POSITIVE LOGITS
     encanta          0.54
    やすい            0.51
     emphatically     0.51
     dearly           0.50
     legjob           0.50
     fortemente       0.50
     banget           0.50
     బాగా             0.50
     hyvin            0.49
     perfectamente    0.48
    Act Density 0.111%

    No Known Activations