Explanations
prolific speakers or sources of information in news articles
Negative Logits
vier        -0.74
aden        -0.73
involved    -0.72
oute        -0.69
wart        -0.65
utz         -0.64
ught        -0.61
vent        -0.61
proc        -0.61
ocket       -0.61
Positive Logits
apart          0.77
engagements    0.75
excerpts       0.73
anonymously    0.71
Speaking       0.71
fluent         0.70
frankly        0.68
unden          0.67
Pitch          0.66
Languages      0.66
Activations Density: 0.027%