INDEX
Explanations
names containing the sequence "hel" along with high activation values
instances of the name "Hel" or words that commonly follow it
Negative Logits
Rated         -0.77
              -0.69
GGGG          -0.67
nomine        -0.61
EED           -0.60
NTS           -0.60
infamous      -0.59
ItemTracker   -0.59
entangled     -0.59
Flight        -0.58
Positive Logits
tered         1.11
mand          1.02
iflower       0.98
ibrary        0.91
tering        0.90
ters          0.90
mes           0.89
iday          0.87
iths          0.87
brook         0.86
Activations Density 0.013%
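
For a sense of scale, a density of 0.013% corresponds to roughly 13 active token positions per 100,000. The sketch below assumes that density means the fraction of token positions where the feature's activation exceeds a threshold; that definition, the function name activation_density, and the sample data are illustrative assumptions, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the activation exceeds threshold.

    Assumption: "density" is the share of positions with activation > threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 13 active positions out of 100,000 tokens.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.013%
```

Raising the threshold above zero would count only stronger activations and so report a lower density for the same data.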