INDEX
Explanations
names of individuals
mentions of the name "Aaron."
Negative Logits
arget      -0.81
yip        -0.75
namese     -0.74
addons     -0.69
ById       -0.69
ribution   -0.68
buff       -0.68
nz         -0.66
lda        -0.66
ribute     -0.66
Positive Logits
Rodgers    1.02
Burr       0.93
Hernandez  0.87
thouse     0.86
Lev        0.83
Goodman    0.81
Brooks     0.76
Finch      0.76
Fors       0.75
Yan        0.73
Activations Density 0.019%