Explanations
references to physical rooms or spaces
mentions of rooms in various contexts
Negative Logits
Token         Logit
Uz            -0.67
retribution   -0.65
HER           -0.64
Iss           -0.63
FTA           -0.59
PB            -0.59
FF            -0.58
Garc          -0.58
FIN           -0.58
Accessed      -0.58
Positive Logits
Token         Logit
room          1.17
rooms         0.95
rooms         0.93
room          0.90
Room          0.88
stairs        0.86
itory         0.85
upstairs      0.84
doors         0.82
Room          0.81
Activations Density: 0.024%