INDEX
Explanations
recommend or suggest
references to the speaker or addressee in a personal, conversational tone (first- and second-person address)
Negative Logits
malfunctions    0.35
ablation        0.33
isotropy        0.32
metabolism      0.31
appliances      0.30
malfunction     0.30
rotting         0.30
denaturation    0.30
servings        0.30
ਆਪਣ             0.29
Positive Logits
recommend       0.50
suggest         0.47
sugiere         0.47
recommande      0.46
recommends      0.45
建議            0.45
empfehlen       0.45
recommending    0.45
建议            0.44
suggested       0.42
Activations Density: 0.288%