Explanations
references to specific poems and their authors
Negative Logits
ctp       -0.16
ocab      -0.14
isty      -0.14
lse       -0.14
inidad    -0.14
adoo      -0.14
rtle      -0.13
jose      -0.13
ocha      -0.13
orna      -0.13
Positive Logits
erson     0.17
utton     0.15
MP        0.15
mp        0.15
brero     0.15
ering     0.14
oria      0.14
usk       0.14
rost      0.14
upon      0.14
Activations Density: 0.077%