Explanations
sentences that express strong emotional reactions or insights
Negative Logits
Token     Logit
elder     -0.15
osate     -0.15
inkel     -0.14
undry     -0.14
leet      -0.14
duct      -0.14
提示      -0.14
igue      -0.14
iguous    -0.14
olia      -0.13
Positive Logits
Token      Logit
tracks     0.23
opener     0.23
track      0.22
Tracks     0.20
tracks     0.19
listeners  0.19
-track     0.19
track      0.19
Listeners  0.19
Track      0.18
Activations Density: 0.078%