INDEX
Explanations
references to a specific subject or topic within a text
Negative Logits
✨:               -0.81
SPS              -0.81
himſelf          -0.76
craw             -0.75
GenerationType   -0.73
dandy            -0.72
faſt             -0.72
Efq              -0.71
ForRow           -0.70
😭😭               -0.70
Positive Logits
subject          2.23
subject          2.05
Subject          2.05
subjects         1.97
Subject          1.93
SUBJECT          1.91
SUBJECT          1.83
Subjects         1.82
subjects         1.74
Subjects         1.72
Activations Density 0.067%
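
An activation density figure like the one above is commonly read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading on made-up data. The function name `activation_density`, the zero threshold, and the toy sample are all assumptions for illustration; the dashboard's exact definition is not stated here.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation exceeds the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data (hypothetical): the feature fires on 2 of 3000 tokens.
sample = [0.0] * 2998 + [1.4, 0.3]
print(f"{activation_density(sample):.3f}%")  # prints 0.067%
```

A density this low means the feature is silent on almost every token, which fits a feature tied to one narrow concept.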