INDEX
Explanations
references to the name "Lin" with varying degrees of specificity
references to a specific individual named Lin
Negative Logits
Seym      -1.02
ij士      -0.81
sburgh    -0.76
therap    -0.72
GROUND    -0.71
theless   -0.70
lain      -0.67
ledged    -0.67
FUL       -0.66
EStream   -0.65
Positive Logits
eman      0.95
ergy      0.94
emate     0.92
ergic     0.91
otype     0.90
sey       0.88
nea       0.86
jiang     0.85
olen      0.85
iets      0.83
Activations Density: 0.011%