INDEX
Explanations
mentions of names or terms related to politicians or political figures, specifically focusing on the term "iden"
references to specific individuals' names
Negative Logits
exha          -0.82
thora         -0.81
Canadians     -0.70
Rita          -0.69
uncontrolled  -0.69
weeds         -0.67
challeng      -0.65
ãħĭ           -0.63
pmwiki        -0.63
Canadian      -0.62
Positive Logits
iden          1.47
ovo           0.90
unci          0.89
ners          0.87
ning          0.86
ned           0.85
ception       0.85
roid          0.77
nen           0.76
vier          0.76
Activations Density: 0.006%