Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image uploads #576

Merged
merged 62 commits into from
Nov 6, 2023
Merged

Image uploads #576

merged 62 commits into from
Nov 6, 2023

Conversation

ekzyis
Copy link
Member

@ekzyis ekzyis commented Oct 19, 2023

To not lose my code again, here is the WIP code.

This PR adds the following button:

button to upload images
2023-10-19-194613_617x219_scrot

The button opens a file explorer. When you've selected an image, it shows up beneath the text input for you to insert into the text:

unsubmitted images show up beneath text input

2023-10-19-195255_612x264_scrot

This works by requesting a presigned AWS S3 URL on image select, immediately uploading the image on the client to S3 (similar to how avatar upload works) and then storing the URL to the image in a state array to keep track of submitted and unsubmitted image URLs.

An image URL is submitted if it's included in an item. We can check if an image URL is included in an item by looking at the Upload.itemId column of the row with Upload.id = <s3_key>. This column is populated by the worker during the imgproxy job. If it encounters a S3 URL in an item, it parses the S3 key from the URL and updates the corresponding upload row with the id of the item that the imgproxy job is currently processing:

if (url.startsWith(AWS_S3_URL)) {
  const s3Key = url.split('/').pop()
  await models.upload.update({ data: { itemId: Number(id) }, where: { id: Number(s3Key) } })
}

Upload.itemId therefore also allows us to show the stacker which images they haven't used yet. This is implemented with a new resolver User.images:

images: async (user, { submitted }, { me, models }) => {
  if (!me) return null
  const uploads = await models.upload.findMany({
    where: {
      userId: me.id,
      itemId: submitted !== undefined ? submitted ? { not: null } : null : undefined
    }
  })
  return uploads
}

To not run unnecessary database queries by adding this simply to the ME query which polls the database every second, I added a new provider ImagesProvider which fetches this once on session init.

When posting something (comment, discussion, link, ...), I used the onCompleted callback to update the state on the client.

That's the current state. Simple showcase:

2023-10-19.20-16-06.mp4

TODO:

  • remove unique item id constraint on table Upload
  • make sure index was not required for any query:
    I searched for Upload (case-sensitive) and "Upload" (since the table might be mentioned using ") and models.update but I haven't found anything of interest. So I think this table wasn't used much so far.

There can be multiple uploads (= multiple images) per item now. However, this requires a data migration since jobs currently rely on there being only exactly one upload per item via Item.uploadId

I have decided to only drop the unique index

desired schema diff
$ g d
diff --git a/prisma/schema.prisma b/prisma/schema.prisma
index d012499..a94a1c3 100644
--- a/prisma/schema.prisma
+++ b/prisma/schema.prisma
@@ -161,7 +161,7 @@ model Upload {
   size      Int
   width     Int?
   height    Int?
-  itemId    Int?     @unique(map: "Upload.itemId_unique")
+  itemId    Int?
   userId    Int
   item      Item?    @relation(fields: [itemId], references: [id])
   user      User     @relation("Uploads", fields: [userId], references: [id], onDelete: Cascade)
@@ -252,7 +252,6 @@ model Item {
   company            String?
   weightedVotes      Float                 @default(0)
   boost              Int                   @default(0)
-  uploadId           Int?
   pollCost           Int?
   paidImgLink        Boolean               @default(false)
   commentMsats       BigInt                @default(0)
@@ -284,7 +283,7 @@ model Item {
   PollOption         PollOption[]
   PollVote           PollVote[]
   ThreadSubscription ThreadSubscription[]
-  upload             Upload?
+  uploads            Upload[]
   User               User[]
   itemForwards       ItemForward[]

by the way, afaict, Upload.itemid was not used before. Possibly just a relict of a previous approach which wasn't pursued further but replaced with Item.uploadId?

  • implement fees to receive presigned URLs
  • refund fees if image is deleted (which is only possible when it wasn't submitted / used in an item yet)

As the code is currently, stackers could upload infinite amount of images to S3 for free. I thought about giving stackers some free images per day such that they don't have to think if uploading an image is worth the sats all the time. However, while writing this and trying to explain my reasoning, I realized it might actually be easier to code if every upload costs sats instead of the other way around because of less edge cases, lol

So image uploading could work similar to fee escalation for posting, just with a different slope. It could even be a staircase pattern: first 5 images per day cost 1 sat, then next 5 cost 10 sat, then next 5 100 etc.

The fee should also depend on the size however. So your first 5 images OR your first 5 MB cost 1 sat per image, whatever is hit first. If you reached 5 MB, you get escalated to the next level which is 10 sat (the image you currently want to upload should already be included in this calculation so you can't upload 100 MB and still pay 1 sat if you haven't uploaded 5 MB yet - a common programming mistake)

However, avatar uploading should still be free and thus an exception imo. This brings me to the next point:

  • use same key for avatars such that stackers can upload avatars still for free but we only store the latest one
    • follow-up: invalidate cache for avatars after update
  • drag and drop images
  • camera support
  • use trash instead of cross icon for delete
  • also delete image in S3 on delete by user
    but make sure delete only works if the URLs weren't submitted already! else there will be 404s for images in items
  • use FileReader API to show progress bar while uploading
  • make sure anons have to pay per image (1000 sats per image?)
    but submitted items should only show within a session, not across all anon sessions. but this opens up bad UX if anon pays for image but then doesn't include it in an item (= unsubmitted) and then forgets the image URL. mhhh 🤔
  • refactor: unify <Upload> in components/upload.js and <ImageUpload> in components/image.js
  • allow anons to use image uploads

New TODOs because of this conclusion:

  • move from upfront images fees for presigned URLs to image fees during item creation
  • remove client logic around unsubmitted images since if there are no more fees for presigned URLs, we don't need to keep track of unsubmitted images on the client - they can just upload another image if they lost an image link
  • periodically delete unused images (= images which weren't used in an item)
  • show image fees in frontend
  • refactor image queries into own wrapper like serializeInvoiceable: serializeImages? image queries are now run inside create_item and update_item
  • allow uploading multiple images at once
  • run image queries in same transaction as create_item ... need to access item id somehow though image queries are now run inside create_item and update_item
  • make sure fees show in frontend are in sync with actual fees calculated in backend

Description how new approach works:

  • Stackers can upload multiple files using presigned URLs. Getting these presigned URLs is free.
  • When the upload starts, [Uploading <file>...]() is pasted into the text input like in Github
  • When the upload finishes, this is replaced with the image URL
  • Additionally, the frontend fetches image fee info from backend to show any image fees to the user
  • 10 MB per 24 hours are free; anons always pay 100 sats per image. See postgres function image_fees_info.
  • On blur, the frontend also fetches image fee info since image URLs might have been deleted
  • On item creation, image_fees_info is called inside create_item to deduct image fees and update rows in the Upload table to mark images as paid (images can be reused in multiple items but only need to be paid once)
  • On item update, image_fees_info is called inside update_item to deduct image fees
  • image_fees_info knows which images were paid and which not using the Upload.paid column
  • if you drag files into the markdown text input, the border is highlighted in blue

@ekzyis ekzyis marked this pull request as draft October 19, 2023 18:29
@ekzyis ekzyis force-pushed the image-uploads branch 2 times, most recently from 652cbf1 to ab74ed6 Compare October 19, 2023 23:16
@ekzyis
Copy link
Member Author

ekzyis commented Oct 20, 2023

  • remove unique item id constraint on table Upload

There can be multiple uploads (= multiple images) per item now. However, this requires a data migration since jobs currently rely on there being only exactly one upload per item via Item.uploadId

I've decided to just drop the unique index. So no other changes required.

This might look weird in the schema now that an item can have a dedicated upload id and multiple uploads can point to it. The row in Upload with Upload.id = Item.uploadId might not even point to the item via itemId.

But this was already the case.

So I think this is ok to guarantee 100% backwards compatibility and if we don't like that, we can change that in another PR.

I hope dropping this index does not make any query slow. I will add checking that to the TODO list.

@ekzyis
Copy link
Member Author

ekzyis commented Oct 20, 2023

Regarding:

  • implement fees to receive presigned URLs

As the code is currently, stackers could upload infinite amount of images to S3 for free. I thought about giving stackers some free images per day such that they don't have to think if uploading an image is worth the sats all the time. However, while writing this and trying to explain my reasoning, I realized it might actually be easier to code if every upload costs sats instead of the other way around because of less edge cases, lol

So image uploading could work similar to fee escalation for posting, just with a different slope. It could even be a staircase pattern: first 5 images per day cost 1 sat, then next 5 cost 10 sat, then next 5 100 etc.

The fee should also depend on the size however. So your first 5 images OR your first 5 MB cost 1 sat per image, whatever is hit first. If you reached 5 MB, you get escalated to the next level which is 10 sat (the image you currently want to upload should already be included in this calculation so you can't upload 100 MB and still pay 1 sat if you haven't uploaded 5 MB yet - a common programming mistake)

We pay $0.023 per GB per month (for the first 50 TB).
$0.023 are 80 sats according to https://www.kraken.com/learn/satoshi-to-usd-converter (October 20, 2023).
This rounded up are 100 sats.
So we pay roughly 100 sats per GB per month (at the current exchange rate).
Per megabyte, this is 100/1024 = ~0.1 sats / MB
Since we will store the images indefinitely, we can let stackers pay for 12 months upfront.
So this makes ~1 sat / MB.

Still quite cheap.

I also realized that it doesn't make sense to limit amount of images. Only the size should matter.

For better numbers and to limit storage per stacker, I am therefore thinking about using the following algorithm to calculate cost per image:

// factor for bytes to megabyte
const MB = 1024 * 1024
// factor for msats to sats
const SATS = 1000

async function uploadCosts (models, userId, photoId, size) {
  let { _sum: { size: sumSize } } = await models.upload.aggregate({
    _sum: { size: true },
    where: { userId, createdAt: { gt: datePivot(new Date(), { days: -1 }) } }
  })
  // assume the image was already uploaded in the calculation
  sumSize += size
  if (sumSize <= 5 * MB) {
    return 0 * SATS
  }
  if (sumSize <= 10 * MB) {
    return 10 * SATS
  }
  if (sumSize <= 25 * MB) {
    return 100 * SATS
  }
  if (sumSize <= 100 * MB) {
    return 1000 * SATS
  }
  // check return value and throw error on -1?
  // or maybe not use -1 since in case of a bug, we might actually pay stackers to upload images, lol
  return -1 
}

@huumn
Copy link
Member

huumn commented Oct 20, 2023

Edit: Actually, forget that, proceed as you want. I'm probably lacking context and I should just let you do you thing.

@huumn
Copy link
Member

huumn commented Oct 20, 2023

I'm not sure if you get re-notified on edits, but I edited my above comment to be like "ignore me, never mind."

@ekzyis
Copy link
Member Author

ekzyis commented Oct 20, 2023

No, I don't get new notifications per edit and that's good because else you would have gotten a ton of notifications from me editing all my posts on Github, haha

Even though you said I should ignore what you said, I want to respond anyway since I think these are good points you mentioned and I think it's good to have them written down even if we already agree. Also, I can provide that context you might be missing:

We're not charging to make stackers pay for the hosting costs. We are charging to prevent abuse. The costs of even large images is negligible.

The question I'm trying to answer is when it starts to resemble abuse. Is it 25MB per day? 100MB per day? 1GB per day? Or something completely different?

Even ignoring that there is no protection against just spinning up new accounts.

Was also wondering about anon image uploads. That's going to be the most trickiest UX wise I think since they shouldn't see each others unsubmitted images vs. what if they pay but then forget to submit the image. The former is more for UX reasons than privacy reasons though since the images are public anyway.

No one wants to think about the size of the image they're uploading.

Yes, that's why I have decided to do a fee escalation in levels, but the size of the images increases the level. So not a constant rate sats/MB (or even sats/B) but escalating rate with a free quota per day.

I think if we do it like this instead of per image count, most stackers will probably never hit the first level with fees.

@huumn
Copy link
Member

huumn commented Oct 20, 2023

Was also wondering about anon image uploads. That's going to be the most trickiest UX wise I think since they shouldn't see each others unsubmitted images vs. what if they pay but then forget to submit the image. The former is more for UX reasons than privacy reasons though since the images are public anyway.

To zoom out a bit, how useful is it to showcase unsubmitted images? If we suppose it's not very useful, then we can just get rid of it for everyone solving this problem.

Re: UX

I think the way you've handled the UX is really nice in many ways, but could a github ux be better? With you current UX, it's a two step process of (1) uploading then (2) linking. Github's is a one step process.

Are there good reasons to not do it that way?

This ties in with the unsubmitted issue above some I think.

@ekzyis
Copy link
Member Author

ekzyis commented Oct 20, 2023

Are there good reasons to not do it that way

The only reason I'm doing it this way is because if they pay, they pay immediately, not only when they submitted an image.

If we do it like Github, stackers might accidentally delete the link. Then they wasted sats because now they don't know anymore for which link they paid.

So I show them all images they haven't submitted yet including the ability to get a refund if they delete the image.

And I used pay immediately for presigned URL since else it seemed unnecessary complex to detect if URLs were just not submitted yet or if they are just abusing our storage without paying for it.

So I could implement the payment logic within create_item but it's harder to secure since images were already uploaded.

And I wanted to keep client upload with presigned URLs to not diverge too far from what we're already doing with avatars.

@huumn
Copy link
Member

huumn commented Oct 20, 2023

I see. I'm not sure if what I was thinking makes sense in light of some of these problems, but here's how I always thought it would work:

  1. you're writing a post/comment
  2. add an image (works like github)
  3. when you submit, we parse the text and associate any sn uploaded images with the item
  4. after 24 hours (or some other period of time), if we have any images hosted that aren't associated with a user/item, then we delete it

We could even be more or less aggressive with the deletion timing depending on other factors like how often the stacker does this, or how much trust they have.

So I could implement the payment logic within create_item but it's harder to secure since images were already uploaded.

Can you elaborate on the difficulties of securing images this way?

To be clear, the way you've done it is really nice. It's just trading some UX for other things and I want to make sure we really need the other things.

@ekzyis
Copy link
Member Author

ekzyis commented Oct 21, 2023

I see. I'm not sure if what I was thinking makes sense in light of some of these problems, but here's how I always thought it would work:

  1. you're writing a post/comment
  2. add an image (works like github)
  3. when you submit, we parse the text and associate any sn uploaded images with the item
  4. after 24 hours (or some other period of time), if we have any images hosted that aren't associated with a user/item, then we delete it

That's also how I imagined it to work initially and I actually would love to do it like this.

But then I thought about how I would abuse this. I would upload as much images as I can with zero variable costs to me. This ignores bandwidth and other factors since most likely, an attacker already pays for it with a flat fee; so it's independent of how much they exactly use these resources. (You could potentially even argue that not using these resources to attack is actually the cost because of opportunity costs, lol)

Yes, we will delete them at some point if they weren't used to post on SN, but the damage might already have been done in forms of storage costs or known and unknown unknowns.

And yes, this attack might be more theoretically since what exactly has the adversary to gain? Being able to host a lot of images for a limited amount of time? I am not sure if that is a valuable goal, but I wanted to err on the side of caution here. Basically for "Fear of the Unknown"-reasons. So if it's possible to completely shut down such attacks by introducing variable costs, why not do it?

But thanks to the discussion with you, it's clear now how much I am trading UX for security.

UX problems if we introduce immediate variable costs (pay per presigned URL):

  • we now need to add refunds if we not want to be a simple hosting service but only want to allow image uploads to post them on SN. solution: show unsubmitted images to delete for a refund or still use them
  • how to deal with unsubmitted images of anons? we don't want to show unsubmitted images from anons to every anon for UX reasons (not necessary for privacy reasons).
  • paying something before you post is a new pattern which we have to explain
  • unsubmitted images show up below every text input, so they always show up beneath text input to reply to the root item + if you click on reply in a comment

Also, since we want to allow refunds (imo), this actually introduces a new, potentially even more serious risk: we add code which increases the balance of users. So instead of potentially being able to store a lot of images, in case of a bug, it would now be possible to steal funds, lol

So I think after having explored this "pay per presigned URL" approach enough, I would also prefer the "Github way". But just want to mention that:

With you current UX, it's a two step process of (1) uploading then (2) linking. Github's is a one step process.

Is not inherent to the current approach. I could immediately link them on upload and show them as unsubmitted. It's just work I didn't put in yet and I wasn't 100% sure yet if it's worth the work of more cursor tracking code (I think this means we need to refactor some stuff so we can reuse existing cursor tracking code to paste content at the cursor easily). But funnily, my showcase actually demonstrates the UX problem quite well since I clicked on "reply" but I didn't link the image yet, haha

So I'll use this approach now: Stackers don't have to pay and we can now just paste the image link without the UX risk of stackers wasting their sats. If they accidentally delete the link, they can just do a new upload for free.

And then during create_item, we deduct image fees if applicable.

Can you elaborate on the difficulties of securing images this way?

Only two reasons:

  • we can't enforce variable costs this way
  • I need to update existing code rather than write new code. I prefer to write code in layers so existing, battle-tested code can work as it did before. And putting this logic explicit for uploaded images into create_item seemed unnecessarily more complex to me than just dealing with all this image upload stuff in a completely different layer. Which makes it again less secure because of more complexity.

@huumn
Copy link
Member

huumn commented Oct 21, 2023

Your initial approach is really nice and demonstrates your ability to think through all the adversary's moves. Regardless of the exact approach we take (here or elsewhere) we need to forever be capable of doing this exercise.

In our efforts to starve our enemies though, we have to be careful to not starve our allies.

So I'll use this approach now: Stackers don't have to pay and we can now just paste the image link without the UX risk of stackers wasting their sats. If they accidentally delete the link, they can just do a new upload for free.

And then during create_item, we deduct image fees if applicable.

Cool, I do think this is the right approach. I think we can make the incentives for abuse low enough that we don't have to be too concerned about taking upfront variable fees.

If we're wrong, we can reconsider.


Whenever introducing fees comes up, I think of this coin operated park bench

bGXgQQfV6juS_KueETppqhaBKBSHYdzvOMcvNqA_vIk

Adding fees should strictly increase the UX of SN. If it's not clear we're making SN better by introducing a fee, I'd rather we get some real life experience with the fee-free version (which should make it clear if the fee is needed). In the general case, fees make SN's UX worse so it's worth being very selective with them.

I should probably start writing some of these principles down.

@ekzyis ekzyis force-pushed the image-uploads branch 2 times, most recently from b0f5914 to d679485 Compare October 25, 2023 02:57
@ekzyis
Copy link
Member Author

ekzyis commented Oct 26, 2023

This is done now. Only testing and code cleanup is left (see PR description). (I'll keep this in draft until I've tested and refactored the code. Will do this tomorrow.)

  1. upload multiple files at once; anon always pays 100 sats per image:
2023-10-26.02-23-39.mp4
  1. stackers have 10MB free upload per day, then every image starts to cost 10 sats until 25MB; 100 sats until 50MB; 1000 sats until 100MB and then 10k sats. See algorithm here:
2023-10-26.02-35-33.mp4
  1. during editing, only new images (images which weren't paid yet) are considered (also, UX with updating image fees on blur is not ideal):
2023-10-26.03-03-22.mp4
  1. drag and drop is supported:
2023-10-26.03-13-26.mp4
  1. camera is automatically supported on mobile file explorer

  2. unused images are regularly deleted using pgboss.schedule

I think main areas which can be improved is the UX around image fees.

I currently don't show in the receipts when fees per image will start or rise. I made image fees dependent on size (so a specific size must be hit before the next level of fees per image is reached) since I think this way, most stackers won't exhaust their free quota per day compared to a free quota of images based on image amount (and not size). Also, if we limit per image amount, using fees as defense against abuse doesn't really work.

Also, I update image fees in the frontend on upload (images were added) and on text input blur (images might have been removed). Checking if images were removed on blur is not ideal. Maybe I should update fees on every text change but debounced?

edit: I noticed some image fees are not properly shown in the frontend. For example, in the second video, the frontend shows 40 sats but then 100 sats are deducted. Need to look into this. Created TODO in PR description.

api/resolvers/item.js Outdated Show resolved Hide resolved
api/resolvers/item.js Outdated Show resolved Hide resolved
Comment on lines +110 to +113
<tr>
<td>{numWithUnits(0, { abbreviate: false })}</td>
<td align='right' className='font-weight-light'>edit fee</td>
</tr>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 0 sats edit fee so I can use + X sats imageFees ... Also, I might should not have removed the usage of paidSats in the edit receipt.

Copy link
Member Author

@ekzyis ekzyis Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example receipts:

reply:

2023-10-27-031624_509x218_scrot

edit:

2023-10-27-031644_506x188_scrot

I realized paidSats wasn't used before since addImgLink was not used.

So I think it's not a breaking UX change. I think it's fine to mention that editing itself does not cost anything but fees were added because of images.

What might be confusing though is why image suddenly cost something since they don't cost anything until 10 MB were uploaded within the last 24 hours. There is no mention of this in the receipt.

I played a bit around with a ProgressBar, to fix this but that also didn't feel right. Maybe mentioning it in the FAQ is enough? Or a info box in the receipt? But then we have a info box inside a info box, lol.

components/image.js Outdated Show resolved Hide resolved
components/job-form.js Outdated Show resolved Hide resolved
lib/constants.js Outdated Show resolved Hide resolved
@ekzyis
Copy link
Member Author

ekzyis commented Oct 27, 2023

Checking if images were removed on blur is not ideal. Maybe I should update fees on every text change but debounced?

Using debounce with 1 second debouncing now. The only UX problem is now that the user might hit reply before the fees are updated and then more fees might be deducted in the backend without the user knowing in advance.

We could disable the reply button until the image fee info was updated; similar to how the dupe check works for links (post is disabled until dupes are found or 3 sec timeout is hit).

But I think disabling the reply button after one second for every text change is going to be annoying.

Thinking about a pure client-side implementation of the fee calculation so we don't need to do a request at all. We can probably assume that the user is not typing multiple items at once so we can assume that during one "item session", the state of unpaid images does not change. Paid images always stay paid.

@ekzyis
Copy link
Member Author

ekzyis commented Oct 27, 2023

Thinking about a pure client-side implementation of the fee calculation so we don't need to do a request at all. We can probably assume that the user is not typing multiple items at once so we can assume that during one "item session", the state of unpaid images does not change. Paid images always stay paid.

Putting this out of draft now since only this and the lacking info in the receipts can be considered missing and I believe these are not blocking UX things. They can be handled in a follow-up PR imo.

@ekzyis ekzyis marked this pull request as ready for review October 27, 2023 01:38
@ekzyis
Copy link
Member Author

ekzyis commented Oct 27, 2023

Another thing:

there is something interesting, if I upload multiple photos at the same time, it's costing more sats to post, like 5 photos for 51 sats, or is it because it exceed the 10 MB?

The moment you hit 10MB, every image will cost 10 sats - even the ones which would have been free if you didn't hit 10MB in your post. So yes, that's indeed how it works. It's not ideal but it was easier to implement like this
One of the missing UX stuff was to communicate this well but I consider this to not be blocking.
I can also look into making this more fair but then we need to do some crazy math to find the best combination of images which should be free, lol.

@ekzyis ekzyis marked this pull request as ready for review November 2, 2023 00:50
@ekzyis ekzyis requested a review from huumn November 2, 2023 00:50
@huumn
Copy link
Member

huumn commented Nov 5, 2023

Why'd you remove itemId from Upload?

It seems like useful metadata to keep. Imagine we have a profile tab for media that a stacker has uploaded. We could list all their uploads, but how do we link to the context it was shared in given we removed the relationship?

@ekzyis
Copy link
Member Author

ekzyis commented Nov 5, 2023

Why'd you remove itemId from Upload?

Because I realized that for the fee calculation, I only needed to know if it was already paid instead of knowing in which items it was linked.

Previously, the column wasn't used and it was only a one-to-one-relationship. So to continue such a column, it becomes a many-to-many relationship: one item can have multiple uploads, and the same upload can be included in multiple items.

So I decided to replace the column with a simple boolean column instead of maintaining a many-to-many relationship which seemed unnecessary complex for what I was trying to achieve.

In your commit, this line replaces any previous existing upload id with the new one. So this doesn't track all links, only the latest link from an upload to an item.

Copy link
Member

@huumn huumn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated your code to keep itemIds on Upload then realized your code has bugs. 😄

Please fix the bugs on top of my changes.


To speak broadly on changes, when we change something generic we need to consider everything else that uses that code. Form is such a generic thing. Ideally updates to Form should be made in a way that they have no effect on existing things that use it.

When I can't avoid making fundamental changes to a piece of code meant to be generic: I search for everything that uses it, walk through it still working logically after the changes, and manually test it just in case I did something unintentional.

Spaghetti code happens when we fail to make our changes isolated and work with generic parts.

components/form.js Outdated Show resolved Hide resolved
@huumn
Copy link
Member

huumn commented Nov 5, 2023

In your commit, this line replaces any previous existing upload id with the new one. So this doesn't track all links, only the latest link from an upload to an item.

This means we lose the original itemId right? I think that's okay but you're right it's not ideal. Ideally I guess we'd create a many-to-many relationship between Item and Upload so that we have this metadata when we want to use it later.

@ekzyis
Copy link
Member Author

ekzyis commented Nov 5, 2023

This means we lose the original itemId right? I think that's okay but you're right it's not ideal. Ideally I guess we'd create a many-to-many relationship between Item and Upload so that we have this metadata when we want to use it later.

It also breaks trying to insert two images into the same post because of the unique index.

So I think we should replace this column since it's not complete anyway and create a table to maintain this many-to-many relationship

@huumn
Copy link
Member

huumn commented Nov 5, 2023

It also breaks trying to insert two images into the same post because of the unique index.

Oh, I see! Cool, obliterate my changes.

Edit: I'll fix my own changes

@huumn
Copy link
Member

huumn commented Nov 5, 2023

Sorry if I seem impatient. Totally not personal.

I've just been in a bit of a mood all day and I left myself a bunch of code I promised to review 😄

I'll give your code another shot and add the many-to-many relationship. Thanks!

@huumn
Copy link
Member

huumn commented Nov 5, 2023

my todos:

  • many to many item<->upload
  • give upload urls a pretty domain name by aliasing the bucket

@huumn
Copy link
Member

huumn commented Nov 6, 2023

This is excellent.

Can you walk me through why we have static avatar upload ids again? It's causing issues for me possibly because my bucket is configured slightly differently, also see below, and I'm wondering if we can figure out a way to not need to be conscious of caching (I know you know how hard it is to debug that stuff).

Under the following circumstance, won't I see stale profile images?

  1. I visit your profile
  2. cache your profile image
  3. you upload a new one
  4. When I visit your profile again, won't I get served the stale image?
    • you cache bust it for the uploader, but we can't control everyone else's cache

Reasons I can imagine we have static avatar ids (with possible alternatives):

  1. we don't want to charge fees
    • we can exclude profile images in the image fee function
  2. we want to make sure they aren't deleted
    • we can exclude active profile images in the delete function

But I might be missing something.

@huumn
Copy link
Member

huumn commented Nov 6, 2023

Other than that, this is good to go! I haven't tested the fee part yet, but we have some other pending PRs that involve FeeButton so I'll try to QA all of that at once.

@ekzyis
Copy link
Member Author

ekzyis commented Nov 6, 2023

This is excellent.

Can you walk me through why we have static avatar upload ids again? It's causing issues for me possibly because my bucket is configured slightly differently, also see below, and I'm wondering if we can figure out a way to not need to be conscious of caching (I know you know how hard it is to debug that stuff).

Under the following circumstance, won't I see stale profile images?

  1. I visit your profile

  2. cache your profile image

  3. you upload a new one

  4. When I visit your profile again, won't I get served the stale image?

    • you cache bust it for the uploader, but we can't control everyone else's cache

Mhh, I didn't test it with two users but only with one. I expected it wouldn't already work with one user since on page reload, the browser would fetch the URL without the random parameter and thus it may use the cached one, as you mentioned with

you cache bust it for the uploader, but we can't control everyone else's cache

But it didn't, it used the up-to-date one even though there were Cache-Control headers present. But maybe it's still this:

It's causing issues for me possibly because my bucket is configured slightly differently

And maybe testing this stuff out locally is irrelevant since we use a CDN in production and that's where the caching issues will appear. Not on my local build which talks straight to S3.

Reasons I can imagine we have static avatar ids (with possible alternatives):

  1. we don't want to charge fees

    • we can exclude profile images in the image fee function
  2. we want to make sure they aren't deleted

    • we can exclude active profile images in the delete function

But I might be missing something.

Yes, the reason for static avatar ids was to keep them free and thus the storage they can occupy should be limited.

That they aren't deleted is handled by only deleting the rows with paid = 'f'. Avatars have NULL in this row.

To prevent any cache issues, we could give them a new id on every upload but limit the amount of free storage in other, free ways like applying time limits?

@huumn huumn merged commit 8f59042 into stackernews:master Nov 6, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants