Explanations
interactions and activities involving groups of people
Negative Logits
odi         -0.18
azio        -0.17
779         -0.15
.screen     -0.14
illard      -0.14
nal         -0.14
instance    -0.14
Dat         -0.14
776         -0.14
_dat        -0.13
Positive Logits
fa          0.18
shelter     0.16
pot         0.16
oppel       0.16
Eventually  0.16
leg         0.15
eventually  0.15
tog         0.15
spect       0.15
moo         0.15
Activations Density: 0.053%