INDEX
Explanations
mentions of the name "Arthur"
mentions of the name "Arthur."
Negative Logits
Token      Logit
ongyang    -0.84
href       -0.73
initialized -0.72
ple        -0.71
ramid      -0.70
plings     -0.68
arity      -0.67
racted     -0.67
FER        -0.67
ering      -0.66
Positive Logits
Token      Logit
Ashe       1.08
Weasley    0.96
Conan      0.96
Pend       0.91
andise     0.89
Arthur     0.81
Dent       0.80
itic       0.78
ufact      0.76
ian        0.76
Activations Density: 0.025%