INDEX
Explanations
locations or objects related to physical structures
locations and settings associated with incidents or events
Negative Logits
sexes      -0.81
Ps         -0.75
positives  -0.74
phas       -0.69
Parties    -0.66
prizes     -0.66
ernels     -0.63
feats      -0.63
tokens     -0.63
pics       -0.63
Positive Logits
courtyard    0.83
robe         0.81
riddled      0.79
balcony      0.76
overlooking  0.76
blender      0.75
alley        0.74
café         0.73
hallway      0.73
airst        0.72
Activations Density 0.373%