INDEX
Explanations
words or phrases related to interaction or engaging with others
occurrences of the word "interact" and related forms
Negative Logits
ft            -0.79
prus          -0.76
statement     -0.72
ften          -0.70
nown          -0.68
collection    -0.67
ffer          -0.67
aft           -0.67
ppelin        -0.66
enthal        -0.65
Positive Logits
ivity          1.10
uate           1.02
interactions   0.98
interacts      0.96
interact       0.94
interacted     0.94
ively          0.93
ivities        0.89
uates          0.89
interaction    0.87
Activations Density: 0.012%
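
The page does not define how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero, might look like the following. The function name `activation_density` and the `threshold` parameter are hypothetical and are not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires. The source page does not confirm this definition.
    """
    if len(activations) == 0:
        return 0.0
    # Count tokens on which the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# One active token out of four sampled tokens.
print(activation_density([0.0, 0.0, 0.0, 1.5]))  # 25.0
```

Under this reading, a density of 0.012% would correspond to roughly 12 active tokens per 100,000 sampled.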