INDEX
Explanations
instances of the word "Hunter" with varying degrees of activation
mentions of the name "Hunter."
Negative Logits
Token        Logit
raints       -0.97
iment        -0.87
ription      -0.84
ngth         -0.83
itutional    -0.83
ible         -0.81
olor         -0.76
orship       -0.75
ations       -0.75
ational      -0.74
Positive Logits
Token        Logit
sonian       0.98
Hunter       0.89
Hunter       0.72
ãĥ£          0.72
hunter       0.71
hunter       0.71
STON         0.69
hood         0.69
stal         0.68
gon          0.67
Activations Density 0.016%