Explanations
references to specific physical spaces or rooms
references to various types of rooms
Negative Logits
Token               Logit
vous                -0.62
Uz                  -0.61
FSA                 -0.60
Iss                 -0.59
SB                  -0.59
Published           -0.59
³³³³³³³³            -0.59
relentless          -0.57
ocrat               -0.56
++++++++++++++++    -0.55
Positive Logits
Token               Logit
rooms               1.31
rooms               1.21
Rooms               1.20
pots                1.10
room                1.00
wagen               0.98
itory               0.96
chool               0.94
hops                0.93
creen               0.89
Activations Density: 0.013%