Explanations
do followed by verbs or punctuation
Negative Logits
ும  0.79
условия  0.71
不大  0.70
beetje  0.68
alguma  0.67
дки  0.66
лишком  0.65
whatever  0.64
м  0.64
Dock  0.63
Positive Logits
consists  0.84
unfolds  0.81
differs  0.76
organizes  0.74
terminates  0.73
များသည်  0.73
platelets  0.73
இதனால்  0.73
inside  0.71
मधील  0.71
Activations Density 0.304%
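
The density figure above can be read as the share of token positions on which the feature fires. The page does not state how it is computed, so the sketch below assumes the common definition (fraction of positions with activation above zero). The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature
    is active. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.304% would mean the feature is active on roughly 3 of every 1000 token positions.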