INDEX
Explanations
mentions of a specific person's name "Huntsman"
the presence of specific names or notable individuals
Negative Logits
Token    Logit
aults    -0.81
iott     -0.81
ually    -0.77
iate     -0.74
ously    -0.73
iance    -0.70
ingo     -0.70
iary     -0.69
igans    -0.68
olves    -0.68
Positive Logits
Token    Logit
ppo      0.94
rament   0.88
bec      0.81
bara     0.79
ptive    0.78
bs       0.77
Tex      0.76
bus      0.76
ãĥĨ      0.75
brate    0.75
Activations Density: 0.070%