INDEX
Explanations
phrases related to the effectiveness or impact of actions or interventions
phrases related to the effectiveness or impact of actions
Negative Logits
ibrary      -0.65
ynski       -0.64
skip        -0.59
oston       -0.59
ricting     -0.58
Drops       -0.57
roid        -0.56
boards      -0.54
hindsight   -0.54
Venture     -0.54
Positive Logits
wonders     1.04
harm        0.97
justice     0.85
damage      0.83
lasting     0.79
trick       0.79
untold      0.78
miracles    0.77
favors      0.74
duty        0.73
Activations Density: 0.130%