Explanations
words ending in 'll', with activation values ranging from 8 to 10
words or phrases that contain the substring "ll"
Negative Logits
EStream       -0.77
lished        -0.73
guiActiveUn   -0.73
¥µ            -0.70
cknow         -0.70
hered         -0.67
ItemTracker   -0.66
ciating       -0.66
¥ŀ            -0.66
quo           -0.65
Positive Logits
uminati       1.19
ounge         1.15
oyd           1.15
owship        1.03
iard          1.01
sburgh        0.98
ateral        0.98
ibrary        0.96
ows           0.96
inois         0.94
Activations Density 0.027%
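
The density figure above can be read as the share of tokens on which the feature fires. The sketch below illustrates that reading only. It assumes density means "fraction of tokens with activation above zero", which the page does not state, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is taken to mean the share of tokens on which
    the feature is active; the source does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 27 of which activate.
# The active value 9.0 is made up; it only needs to exceed the threshold.
sample = [0.0] * 99_973 + [9.0] * 27

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.027%
```

Under this assumption, a density of 0.027% means the feature is active on roughly 27 of every 100,000 tokens.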