INDEX
Explanations
proper nouns and names related to people or places
mentions of names or terms associated with Claude and related concepts
Negative Logits
iance       -0.84
enegger     -0.77
visors      -0.76
mable       -0.71
bucks       -0.68
iem         -0.68
burse       -0.67
ailable     -0.67
oppable     -0.66
tops        -0.66
Positive Logits
apeake      0.81
Minutes     0.76
ements      0.75
ignty       0.70
illon       0.70
illac       0.70
thia        0.69
uty         0.68
iton        0.68
odies       0.67
Activations Density 0.041%