Explanations
references to social interactions in cafés and parks
Negative Logits
Token      Logit
ulton      -0.18
dinner     -0.17
Dinner     -0.16
Cook       -0.16
cooking    -0.16
Cooking    -0.16
shower     -0.15
999        -0.15
ledge      -0.15
cook       -0.15
Positive Logits
Token      Logit
coffee     0.23
caffe      0.23
cafe       0.22
Coffee     0.21
Starbucks  0.21
café       0.21
Coffee     0.20
caffeine   0.19
coff       0.19
ì»         0.18
Activations Density: 0.099%