INDEX
Explanations
phrases indicating readiness or preparedness for action
Negative Logits
stay          -0.17
Ù쨧ÙĤ          -0.17
عاÙĨ          -0.16
бÑĥÑĤ         -0.15
stays         -0.15
äter          -0.15
styleType     -0.15
/Typography   -0.15
ité           -0.14
pra           -0.14
Positive Logits
tackle        0.21
rock          0.21
action        0.20
accept        0.20
begin         0.19
tackling      0.19
go            0.18
rum           0.18
tackled       0.18
Go            0.17
Activations Density: 0.052%