Explanations
pronouns and determiners followed by verbs expressing interaction or association with specific individuals or groups
phrases referring to relationships and connections with people or groups
Negative Logits
Token      Logit
puter      -0.68
trap       -0.67
pez        -0.63
artifacts  -0.62
PIN        -0.60
Patt       -0.59
adobe      -0.59
stage      -0.57
Tracker    -0.57
äºĶ        -0.57
Positive Logits
Token      Logit
soever     0.88
usalem     0.75
omes       0.64
illas      0.63
atar       0.63
xual       0.62
omever     0.62
many       0.62
sey        0.62
igor       0.60
Activations Density: 0.031%