INDEX
Explanations
phrases related to the impact and effects of various policies and actions
Negative Logits
swick     -0.80
mint      -0.66
staples   -0.64
mberg     -0.64
ç«        -0.63
zar       -0.62
vision    -0.61
croft     -0.60
åī        -0.60
otide     -0.59
Positive Logits
effects   1.24
iveness   1.04
impact    0.97
impacts   0.94
bringer   0.94
Effects   0.91
ripple    0.90
effects   0.89
ful       0.88
effect    0.88
Activations Density: 2.344%