INDEX
Explanations
bedroom-related words and scenarios
references to bedrooms
Negative Logits
itudes    -0.81
ename     -0.76
iful      -0.75
indal     -0.74
ific      -0.70
ifully    -0.69
andum     -0.69
rity      -0.68
ribut     -0.68
berman    -0.67
Positive Logits
stairs    1.04
closet    0.92
Door      0.89
bedroom   0.84
upstairs  0.83
door      0.82
floor     0.80
Floor     0.78
room      0.77
door      0.75
Activations Density: 0.019%