Explanations
references to specific physical locations
references to rooms in various contexts
Negative Logits
HER          -0.73
Uz           -0.69
Iss          -0.64
PB           -0.61
FIN          -0.61
Accessed     -0.60
RBI          -0.60
elson        -0.59
retribution  -0.58
FTA          -0.56
Positive Logits
room         1.21
rooms        1.02
rooms        0.96
room         0.93
itory        0.87
Room         0.86
doors        0.85
upstairs     0.81
stalls       0.81
Room         0.80
Activations Density: 0.030%