Skip to content

Handmade Software Foundation Memberships#39

Draft
reece365 wants to merge 15 commits into
masterfrom
hsf-memberships
Draft

Handmade Software Foundation Memberships#39
reece365 wants to merge 15 commits into
masterfrom
hsf-memberships

Conversation

@reece365
Copy link
Copy Markdown
Collaborator

@reece365 reece365 commented Jun 2, 2026

Implementation of a new system that allows users to initiate and manage a membership with the Handmade software foundation, managing recurring payments, notification of users, logging, benefits and more. Fundamentally, the system is comprised of series of Stripe driven state machines, directing users to hosted payment pages, handling webhooks returned by Stripe, notifying users as defined by the state via email, and applying benefits as appropriate.

Though the core payment flow is simple at heart, (only requiring handling a few webhooks and minimal UI changes), complexity grows exponentially upon the integration of asynchronous payment methods such as ACH; rich email notifications; change of payment method; grace periods; cancellation; and bank account verification.

Furthermore, the asynchronous and, by it's nature, disorderly state of Stripe's webhooks results in much of the complexity found in this PR, requiring idempotency guards, event ledgers, and nuanced handlers to ensure that incoming data from Stripe is handled in the most Handmade way possible, mitigating the risk of double-triggered and disordered webhooks.

Webhooks needed to be dispatched between handlers for ticket sales and memberships, as there was significant overlap in webhook types which needed processing between the two systems. This requited the development of a combined router/handler which "dispatches" out events to either ticket or membership handlers based on priceID.

The system supports multiple currencies, (at this moment, EUR and USD); future support for additional tiers; management of Discord roles as a benefit of membership, (with the ability to manually sync roles); a customizable "grace period" for highly asynchronous payment methods like ACH; and the logging of user membership info for future analytics.

Additionally, the system includes a rich testing and debugging suite, (admin membership), for easy analysis of the complex state machines which make up this system. The included tests are fully end to end, simulating actual Stripe webhooks as user actions, creating test customers, payment methods, and checkout sessions, then ensuring the correct database state. Further, these tests include the Mailpit test SMTP server to verify the correct notification emails are being sent at the appropriate point in a given scenario.

This PR contains a complex system which spans many parts of hmn. Due to the project requirements, the features needed, and the complex and asynchronous nature of real-world payment handling, the system footprint is larger than expected. Despite this, you will find these edits to be made with good reason, to address the many possible edge cases that may be found when interacting with numerous payment types, currencies, and configurations.

The system has been broken up into components wherever possible to improve the "Handmade-like" serviceability and readability, and conventions have been followed from elsewhere in the project.

@reece365 reece365 requested review from AsafGartner and bvisness June 2, 2026 10:32
@reece365 reece365 closed this Jun 2, 2026
@reece365 reece365 reopened this Jun 2, 2026
Copy link
Copy Markdown
Member

@bvisness bvisness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good overall. I have to go to bed but will resume review tomorrow.

}

func (m AddDetailedSubscriptionInfo) Name() string {
return "AddDetailedSubscripitionInfo"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would very much like to know how a typo snuck in here but not into the type name :P

(it does not matter one bit, but it is funny)


func (m AddDetailedSubscriptionInfo) Up(ctx context.Context, tx pgx.Tx) error {
_, err := tx.Exec(ctx, `
ALTER TABLE hmn_user
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit odd to have so many migrations that all modify the user table. Migration files kind of serve double duty as modifying the schema and serving as reference for the database schema without having to pull up the Postgres CLI and run \d.

Basically what I am saying is that it would be convenient to roll all these migrations together into one big migration, so that you can see all the fields at once. (This also would have the advantage of reordering the columns.)


CREATE TABLE user_payment (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES hmn_user(id) ON DELETE CASCADE,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably prefer ON DELETE RESTRICT here, so that you can't delete the user record until you've cleaned up the membership. That way we would be reminded to go figure things out in the Stripe dashboard and make sure we've cleaned up any subscriptions.

func (m AddThankYouEmailSent) Up(ctx context.Context, tx pgx.Tx) error {
_, err := tx.Exec(ctx, `
ALTER TABLE hmn_user
ADD COLUMN thank_you_email_sent BOOLEAN NOT NULL DEFAULT false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

membership_thank_you_sent

_, err := tx.Exec(ctx,
`
ALTER TABLE hmn_user
ADD COLUMN is_subscribed BOOLEAN NOT NULL DEFAULT false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too generic a name imo; could easily apply to newletters and things. Let's make sure membership appears in all these new column names, e.g. has_membership, membership_stripe_customer_id, etc.

Comment thread src/website/stripe.go
Comment thread src/website/stripe.go
Comment thread src/website/stripe.go
if _, ok := seen[id]; ok {
return
}
seen[id] = struct{}{}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this trying to deduplicate price IDs? Why? It's in config; we will literally never put in duplicates.

Comment thread src/website/stripe.go
Comment on lines +369 to +372
membership := map[string]struct{}{}
for _, id := range membershipWebhookPriceIDs() {
membership[id] = struct{}{}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And now building another hash map out of the slice of price IDs we just constructed while constructing a hash map of price IDs to deduplicate them! This is all very silly and totally unnecessary. membershipWebhookPriceIDs should just return a slice of all price ids from config, assuming no duplicates, and then you should just loop through that slice here, no hash map. It will almost certainly be faster than allocating a hash map and checking the hash map for a string.

Better yet, why is the value in config.go not itself just a slice of price IDs? I'm not sure there's a good reason to have one "primary" one and a list of "alternates", when you could just have a list of all price IDs, and we default to the first one in the list. Then you wouldn't need membershipWebhookPriceIDs at all; you would just loop over the list from config.

Comment thread src/website/stripe.go
return stripeWebhookKindUnknown, oops.New(err, "failed to load ticket price ids")
}

known := make(map[string]struct{}, len(ticketPriceIDs))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the hash map is superfluous. Consider what you have to do to build the hash map and then look up an item in it:

  • Allocate the backing storage.
  • Loop through ticketPriceIDs, hashing each string (which means iterating every byte in the string) and then inserting into the backing storage (resolving collisions as necessary).
  • Loop through priceIDs, and for each price ID, hash the string (iterating every byte) and look it up in the hash map:
    • If the hash matched an item, do a full priceID == knownID string comparison because hashes can collide. If they match, return stripeWebhookKindTicket, otherwise, move onto the next item in the hash map and run this step again, or continue to the next price ID because there was no match.
  • (later) Garbage collect the backing storage.

Whereas if you you simply do a double for loop over ticketPriceIDs and priceIDs, you will just do m * n string comparisons and zero memory allocations, and that's that. But len(ticketPriceIDs) is currently 1, and in a few years it might be as high as 5, and priceIDs comes from the webhook event and I can't even imagine what would cause it to be greater than 1. At most I figure it could be 2, because that's the number of prices in our config and on the subscription.

I know it seems like the "right" thing to do is to avoid the double for loop, because n^2 algorithms are bad etc. But we are currently looking at a maximum of two string comparisons, and in the future maybe it would be like 10, and probably in many cases it's still less than that. This is nothing. The right thing to do is the stupid thing, which will nonetheless complete in much less than a millisecond and totally avoid allocating garbage that could impact the rest of the application.

@reece365
Copy link
Copy Markdown
Collaborator Author

reece365 commented Jun 3, 2026

I'm also about to head to bed, hoping to tackle some more of those prickly database structure and migration questions in the morning. I appreciate you going so in-depth into reviewing everything @bvisness. I think at some points I've been overly defensive in the code, creating checks for non-existent, (or very esoteric), cases and such. Most of the time this is just based on the mindset of, "oh well why not just check", instead of thinking of how something like a blank customer ID might actually occur. All of that's to say, I appreciate the very real lesson in engineering pragmaticism

Copy link
Copy Markdown
Member

@bvisness bvisness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I had more time to review this! I will keep chipping away at it when I can, partially while I'm traveling tomorrow.

As I start to work through more of the core Stripe logic here, I am finding that it's a bit messy in an unsurprising sort of way. It is basically a bit overwhelming to read 10 files with a dozen free-floating functions each, and to have no big-picture idea of how they connect and what order things occur in.

I think we will need to just keep chipping away at this and refining it until it makes more sense, but a general thing I think would be valuable right now is to just try and document the big picture of what events we expect from Stripe for each of the major flows, and what files/functions are responsible for handling each of them. Then, step back from that and see if you can reorganize the code around that summary. I suspect you will find a lot of small simplifications.

Realistically, we won't be able to launch this at the Expo, but I'm not too concerned about that. We'll see if we can at least get it to a point where I could show it off, then launch it in the following weeks.

Comment on lines +29 to +34
if override := config.Config.Stripe.SubscriptionNowOverride; override != "" {
if parsed, err := time.Parse(time.RFC3339, override); err == nil {
return parsed
}
}
return time.Now()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit strange; why do we need both a public method in here and a value we parse from config? And why is the value in config a string anyway instead of a time.Time value?

Comment on lines +82 to +99
func isHardDeclineErrorCode(code string) bool {
switch stripe.ErrorCode(code) {
case stripe.ErrorCodeCardDeclined,
stripe.ErrorCodeInsufficientFunds,
stripe.ErrorCodeExpiredCard,
stripe.ErrorCodeIncorrectCVC,
stripe.ErrorCodeIncorrectNumber,
stripe.ErrorCodeInvalidCVC,
stripe.ErrorCodeInvalidExpiryMonth,
stripe.ErrorCodeInvalidExpiryYear,
stripe.ErrorCodeInvalidNumber,
stripe.ErrorCodeProcessingError,
stripe.ErrorCodeAuthenticationRequired:
return true
default:
return false
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are so many functions in here that I think it would be good to consolidate some things that are only called in one place. It's very hard to wrap your head around the control flow of a program when you have lots of small functions that could be called from anywhere.

So for example, because this is only called from one place, you could inline this into paymentIntentIsHardDecline like so:

func paymentIntentIsHardDecline(pi *stripe.PaymentIntent, paymentMethodType string) bool {
  hardDeclineCodes := []stripe.ErrorCode{
    stripe.ErrorCodeCardDeclined,
    stripe.ErrorCodeInsufficientFunds,
    // ...
  }

  // ... 
  if pi.LastPaymentError != nil {
    return slices.Contains(hardDeclineCodes, pi.LastPaymentError.Code)
  }
  // ...
}

Sometimes I'll even just do nested functions for something that is only of use in one place, just to make it clear that it's only useful in one place. As a general principle, I actually prefer bigger, longer functions instead of lots of tiny functions when the overall flow of the logic is complex and there is only one path through it, because as a reader I can have confidence that the thing I am reading has a purpose that I can find in the same function and nowhere else.

}
}

func paymentIntentPaymentMethodType(ctx context.Context, sc *stripe.Client, pi *stripe.PaymentIntent) string {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find myself wishing for some comments throughout these Stripe-specific files that explain some of the intent behind each function.


func isAsyncPaymentMethodType(pmType string) bool {
switch pmType {
case "us_bank_account", "acss_debit", "sepa_debit":
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are acss_debit and sepa_debit European bank types or something? I would just remove them if we don't support those right now.

Comment thread src/website/stripe.go
return c.JSONErrorResponse(http.StatusBadRequest, oops.New(err, "bad JSON in stripe webhook"))
sc := stripe.NewClient(config.Config.Stripe.SecretKey)

if isMembershipGracePaymentRetryEvent(&event) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit strange that we have these two special cases for membership-related events before we get to the switch below. I'm guessing it's because the switch classifies which "system" the event belongs to by price ID, but these types of event don't have a price ID?

Personally I think it would make more sense if you broadened the "classifier" to classify the event generally, not just the price ID. I imagine it would look something like:

// Figures out which high-level "system" a Stripe event belongs to (e.g. event tickets vs. memberships)
func classifyStripeEvent(event *stripe.Event, priceIDs []string) stripeWebhookKind {
	// We know some events belong to memberships even if we have no price ID
	switch event.Type {
	case "payment_method.attached", "customer.updated", // membership payment retry
		"payment_intent.processing", "payment_intent.requires_action", "payment_intent.payment_failed": // membership payment intent
		return stripeWebhookKindMembership
	}

	// Otherwise, compare the price IDs against those in config

	membershipPriceIDs := append(config.Config.Stripe.PriceID, config.Config.Stripe.MembershipAlternatePriceIDs...)
	ticketPriceIDs, err := db.QueryScalar[string](ctx, conn, `
		SELECT stripe_price_id
		FROM ticket_metadata
		WHERE stripe_price_id <> ''
	`)
	if err != nil {
		return stripeWebhookKindUnknown, oops.New(err, "failed to load ticket price IDs")
	}

	for _, id := range priceIDs {
		if slices.Contains(membershipPriceIDs, id) {
			return stripeWebhookKindMembership, nil
		}
		if slices.Contains(ticketPriceIDs, id) {
			return stripeWebhookKindTicket, nil
		}
	}

	return stripeWebhookKindUnknown, nil
}

Then you could make the membership retry / payment intent handling a normal part of the flow inside handleMembershipStripeEvent.

@reece365
Copy link
Copy Markdown
Collaborator Author

reece365 commented Jun 4, 2026

I appreciate all your feedback so far, and it looks like I have a bit of work cut out for me in just documenting much of the Stripe control flow and logic. I'll get started on some more verbose comments and documentation so it's easier to understand the point of each of these functions. With that said, while I work on that, do you feel like certain changes are more pressing in terms of getting the codebase into a state able to be demoed on Saturday? Is that something you feel is still reasonable while we chip away at some of these issues before a full launch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants