INDEX
Explanations
instances of the word 'Angel' with a strong focus on specificity
mentions of "Angel" and related entities or characters
Negative Logits
Token           Logit
rences          -0.80
llah            -0.74
yip             -0.70
elig            -0.68
independents    -0.67
cloth           -0.66
olicy           -0.65
ãģ¦             -0.65
merce           -0.64
theless         -0.64
Positive Logits
Token           Logit
enos            1.38
eno             1.19
ique            1.06
ina             1.00
ica             0.97
icals           0.96
inian           0.94
ient            0.92
ista            0.92
ic              0.92
Activations Density 0.030%