Explanations
references to the word "who."
who is or did what
Negative Logits
Token         Logit
FES           -0.47
Ext           -0.45
Fes           -0.42
carefully     -0.41
WithFormat    -0.40
eway          -0.40
a             -0.39
嘉            -0.39
careful       -0.39
突            -0.38
Positive Logits
Token         Logit
hvem          1.01
Who           0.88
siapa         0.87
quién         0.85
Siapa         0.82
Hvem          0.82
Who           0.80
Quién         0.79
Quiénes       0.79
誰が          0.75
Activations Density: 0.009%