INDEX
Explanations
themes of relatability and personal connection in narratives
Negative Logits
icari     -0.18
anou      -0.18
alone     -0.16
views     -0.15
adder     -0.15
etyl      -0.14
ekler     -0.14
dued      -0.14
orgeous   -0.14
Hawkins   -0.14
Positive Logits
iras      0.16
inas      0.15
annis     0.15
aggio     0.15
olas      0.15
enger     0.14
tor       0.14
TF        0.14
alia      0.14
òi        0.14
Activations Density 0.030%