INDEX
Explanations
words related to political figures, specifically the names "Clinton" and "Biden"
references to specific individuals, particularly their names and titles
Negative Logits
Token       Logit
exha        -0.81
ãħĭ         -0.77
thora       -0.76
Pwr         -0.76
pmwiki      -0.75
rul         -0.71
HAHAHAHA    -0.71
«ĺ          -0.71
ILCS        -0.70
murd        -0.70
Positive Logits
Token       Logit
iden        1.25
ners        0.96
ovo         0.88
vier        0.87
ned         0.85
fold        0.85
unci        0.81
ning        0.80
ception     0.78
ovic        0.77
Activations Density: 0.011%