Explanations
references to classic literature and notable literary figures
Negative Logits
Token     Logit
ivec      -0.16
troub     -0.16
Shelley   -0.15
æľĭ       -0.14
æ²ĸ       -0.14
Jet       -0.14
uments    -0.14
hea       -0.14
eus       -0.14
spacer    -0.14
Positive Logits
Token     Logit
Herc      0.34
Christie  0.29
Hastings  0.25
Orient    0.23
Belgian   0.20
Miss      0.19
Miss      0.19
poi       0.19
Ag        0.18
Christ    0.18
Activations Density: 0.004%