INDEX
Explanations
mentions of a specific person named Al Franken
mentions of specific political figures
Negative Logits
OTS          -0.78
IPS          -0.76
istically    -0.76
worldly      -0.73
IRD          -0.73
ISION        -0.72
Reviewer     -0.69
ynt          -0.67
IDE          -0.67
atform       -0.67
Positive Logits
heimer       1.10
Franken      1.04
steen        0.87
furt         0.87
ste          0.85
Ò            0.85
fur          0.84
fort         0.80
berger       0.80
thal         0.76
Activations Density 0.002%