INDEX
Explanations
references to food and physical actions, such as eating and moving
Negative Logits
anism       -0.92
evidence    -0.74
peak        -0.71
grounds     -0.70
iments      -0.70
Edit        -0.68
agree       -0.68
matters     -0.67
Autom       -0.66
tm          -0.65
Positive Logits
lot         1.37
bunch       1.31
handful     1.24
couple      1.22
few         1.16
whopping    1.14
plethora    1.12
slew        1.11
dozen       1.06
bit         1.02
Activations Density: 1.464%