Explanations
names of famous individuals
Negative Logits
deals      -0.74
shire      -0.70
thing      -0.67
iaries     -0.66
keye       -0.66
mates      -0.65
acies      -0.65
alore      -0.65
cot        -0.64
dayName    -0.63
Positive Logits
Perez      0.88
Stewart    0.86
Mats       0.83
Burton     0.82
Thompson   0.81
Malik      0.80
Thor       0.80
Hir        0.80
Suzuki     0.79
Hend       0.79
Activations Density: 0.126%