INDEX
Explanations
references to the act of hearing about something
Negative Logits
sis          -0.76
isoft        -0.74
ila          -0.74
igsaw        -0.73
eros         -0.71
iru          -0.69
istration    -0.69
arcity       -0.68
gradient     -0.67
redo         -0.65
Positive Logits
voices       1.25
Voices       0.98
firsthand    0.96
footsteps    0.95
confessions  0.94
aloud        0.92
whispers     0.92
testimonies  0.90
loud         0.89
podcasts     0.89
Activations Density: 0.485%