INDEX
Explanations
verbs and actions related to installations and preparations
Negative Logits
were      -0.75
have      -0.58
are       -0.54
weren     -0.54
byli      -0.53
read      -0.53
were      -0.53
אלה       -0.51
الذين     -0.50
explore   -0.50
Positive Logits
wears        1.13
sends        1.07
celebrates   1.06
reaches      1.05
prepares     1.05
marries      1.05
connects     1.05
earns        1.04
learns       1.02
goes         1.02
Activations Density 1.053%