INDEX
Explanations
phrases related to government and policies
Negative Logits
ses       -0.80
ãģĨ       -0.71
1001      -0.67
pees      -0.63
atical    -0.61
iking     -0.61
olves     -0.61
fell      -0.60
ema       -0.60
sk        -0.59
Positive Logits
romeda         1.21
furthermore    1.10
rea            1.06
therein        1.05
although       0.99
secondly       0.99
moreover       0.97
ersen          0.97
besides        0.92
yes            0.91
Activations Density: 0.075%