INDEX
Explanations
instances where the document discusses making a difference or an impact
phrases indicating significance or impact
Negative Logits
diligently    -0.67
ħĭ            -0.62
IDS           -0.57
lass          -0.57
manually      -0.56
mathemat      -0.56
Strategies    -0.55
gat           -0.54
razen         -0.53
battled       -0.52
Positive Logits
sense         0.81
sense         0.76
difference    0.75
wearer        0.72
Difference    0.71
mockery       0.68
impression    0.67
jar           0.67
noticeable    0.66
shudder       0.65
Activations Density 0.188%