Explanations
phrases that indicate sourcing or referencing information
Negative Logits
Token         Logit
ANS           -0.92
accompan      -0.86
mire          -0.84
ans           -0.80
window        -0.78
dayName       -0.75
enth          -0.74
Cosponsors    -0.74
ourse         -0.74
anu           -0.73
Positive Logits
Token            Logit
occasional       1.12
maybe            0.86
those            0.83
sporadic         0.80
inconvenience    0.78
perhaps          0.75
occasionally     0.74
anecdotal        0.73
its              0.73
superficial      0.72
Activations Density: 0.020%