INDEX
Explanations
instances of the word 'give' or 'gave'
Negative Logits
Token           Logit
can             -0.65
vell            -0.62
psc             -0.59
pora            -0.59
atis            -0.58
alde            -0.58
proceeding      -0.57
ansas           -0.57
inyl            -0.57
ourge           -0.55
Positive Logits
Token           Logit
refunds         0.90
birth           0.87
rise            0.86
condolences     0.85
generously      0.85
aways           0.85
us              0.85
insight         0.83
advice          0.80
unconditional   0.78
Activations Density: 2.990%