    Explanations

    references to the speaker or addressee in a personal, conversational tone (first- and second-person address)

    Negative Logits
    " malfunctions"    0.35
    " ablation"        0.33
    " isotropy"        0.32
    " metabolism"      0.31
    " appliances"      0.30
    " malfunction"     0.30
    " rotting"         0.30
    " denaturation"    0.30
    " servings"        0.30
    " ਆਪਣ"             0.29
    Positive Logits
    " recommend"       0.50
    " suggest"         0.47
    " sugiere"         0.47
    " recommande"      0.46
    " recommends"      0.45
    "建議"              0.45
    " empfehlen"       0.45
    " recommending"    0.45
    "建议"              0.44
    " suggested"       0.42
    Activation Density: 0.288%

    No Known Activations