INDEX
Explanations
references to caves
Negative Logits
pring       -0.77
abeth       -0.73
icious      -0.69
ACTIONS     -0.69
oppable     -0.69
imated      -0.65
oice        -0.64
ICS         -0.63
orporated   -0.63
ulz         -0.61
Positive Logits
cave        1.08
caves       1.01
Dwell       0.91
paintings   0.90
entrances   0.83
canyon      0.77
yrinth      0.75
tto         0.73
lings       0.72
door        0.72
Activations Density 0.031%