I understand, and it sucks about the lack of response on the Hangfire forum. However, the problem you’ve posted is about 70% ASP.NET, 20% architecture and 10% Hangfire. This isn’t a super active forum, and a lot of similar or unrelated questions get asked.
Web programming is often thought of as simple, but there are always multiple modalities in play (browser, server, database) and it’s very important that each boundary is understood. There are lots of good courses out there that can teach you more about the interactions between browser and server, which would probably be a worthwhile time investment if you want to ensure you write a secure app. I’d bet you’d pick it up relatively quickly as it’s not rocket science, just different from writing desktop applications or mainframe code.
When you’re programming at the web server level, you have to understand that the only interaction you have with the browser is the data it sends in its HTTP request (carried over a TCP connection) and the corresponding HTTP response that you send back. You don’t actually have access to the user’s browser directly, or to any of the resources on the user’s machine.
In the simplest case, when a user in a browser picks a file, maybe fills in some form fields and clicks the submit button on their form, an HTTP request is made from the browser to the web server. In your case, this is a “multipart/form-data” request, which looks like some string data with blocks of binary data in between, plus some header data, the URL, etc. Then, generally, the browser sits there waiting for the response from the server, with several timeouts (on both server and browser) that may result in the wait being canceled.
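To make the multipart form request less abstract, here’s a rough sketch of what such a body looks like on the wire and how a server-side framework splits it apart. It’s plain Python using the standard library’s `email` parser as a stand-in, not ASP.NET code; the boundary string, field names, file name and the `parse_multipart` helper are all made up for illustration, and the “file” here is text rather than real binary data.

```python
from email.parser import BytesParser
from email.policy import default

# Hypothetical boundary; a real browser generates a random one per request.
BOUNDARY = "sketchboundary123"
CONTENT_TYPE = f"multipart/form-data; boundary={BOUNDARY}"

# Roughly what the browser sends as the request body: one plain form field
# and one file, each introduced by the boundary and its own small headers.
body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="description"\r\n'
    "\r\n"
    "monthly export\r\n"
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="upload"; filename="records.csv"\r\n'
    "Content-Type: text/csv\r\n"
    "\r\n"
    "id,name\r\n"
    "1,Ada\r\n"
    f"--{BOUNDARY}--\r\n"
).encode("utf-8")


def parse_multipart(content_type: str, raw_body: bytes) -> dict:
    """Split a multipart body into {field name: (file name or None, bytes)}."""
    # The email parser expects headers first, so prepend the Content-Type
    # header that would normally arrive as a request header.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + raw_body
    message = BytesParser(policy=default).parsebytes(raw)
    parts = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        parts[name] = (part.get_filename(), part.get_payload(decode=True))
    return parts


fields = parse_multipart(CONTENT_TYPE, body)
```

This splitting is the work the web framework does for you before your own code ever sees the upload.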
When the request hits the web server, there’s a pipeline of different “modules” that the request will flow through, eventually landing in ASP.NET Core. ASP.NET Core is kind enough to take care of turning that request data into an IFormFile abstraction, which saves you the work of reading and splitting up the stream of data in the request. It then gives you the reins to decide what you want to do next with the data and what response you want to send back to the user’s browser.
As a best practice you don’t want to do any long processing in your controller method:
- The user experience sucks for the user making the browser request (they’re sitting there waiting)
- You run the risk of the request timing out or the IIS app pool restarting, either of which will result in incomplete work
In the BH times (before Hangfire), the systems architecture for this problem would be much more complicated. You’d still be saving the file and saving state info to the database, but then you’d pass the information via some out-of-process communication to a Windows service (daemon) that would then do the work. Hangfire allows us to skip all the complexity of a 4th process/modality, but not the requirements for good async programming.
As for the other concerns and problems you mentioned:
- File upload concerns are valid, I guess. DoS is always a potential problem. To keep uploaded viruses from compromising your network and servers: restrict the allowed file extensions (the file name is on the IFormFile object), don’t run ASP.NET Core under an elevated user account, don’t execute the file, validate that the contents look the way you expect, run a virus scanner on your working directories, and delete the file when you’re done with it.
- Saving state is always going to be required for async processing. We typically use a pattern like: create a “Working Directory”, save the file to the Working Directory, save state about the request/file to database table(s), queue up a Background Job with the primary key for the state data row(s), return the primary key as the HTTP response. When we do our processing work, we may also save additional progress/logging state back to that table to understand what to do if the Background Job is restarted. Processing state might include start/finish times, current batch of records being uploaded, success/failure codes/logs, etc.
- Communicating back to the user is a very application-specific requirement. Maybe you send an email. Maybe you have a piece of web user interface that polls your web server while the request is being processed, asks about the state you stored in the previous point until it is complete or in error, and presents that processing state to the user. There’s a bunch of different approaches to this problem.
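Here’s a minimal sketch of the save-state-then-queue pattern from the points above, including the extension check and the status lookup a polling UI would call. Again, it’s language-neutral Python and not ASP.NET or Hangfire code: the in-memory SQLite table, the in-process `queue.Queue` (standing in for the Hangfire job queue), the `.csv` allowlist and every function and column name are assumptions made up for illustration.

```python
import queue
import sqlite3
import tempfile
import uuid
from pathlib import Path

ALLOWED_EXTENSIONS = {".csv"}             # hypothetical allowlist
jobs: "queue.Queue[int]" = queue.Queue()  # stand-in for the background job queue
work_root = Path(tempfile.mkdtemp(prefix="uploads-"))

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE upload_state (id INTEGER PRIMARY KEY, path TEXT,"
    " status TEXT, rows_done INTEGER, error TEXT)"
)


def accept_upload(filename: str, content: bytes) -> int:
    """Controller side: validate, save file, save state, enqueue, return key."""
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed")
    # Working directory and stored name are chosen by the server; the
    # client-supplied file name is only used for the extension check.
    work_dir = work_root / uuid.uuid4().hex
    work_dir.mkdir()
    path = work_dir / "upload.csv"
    path.write_bytes(content)
    cur = db.execute(
        "INSERT INTO upload_state (path, status, rows_done) VALUES (?, 'queued', 0)",
        (str(path),),
    )
    db.commit()
    jobs.put(cur.lastrowid)  # the job only carries the primary key
    return cur.lastrowid     # this is what goes back in the HTTP response


def run_next_job() -> None:
    """Background side: load state by key, process, record progress."""
    upload_id = jobs.get()
    (path,) = db.execute(
        "SELECT path FROM upload_state WHERE id = ?", (upload_id,)
    ).fetchone()
    db.execute("UPDATE upload_state SET status = 'processing' WHERE id = ?", (upload_id,))
    try:
        rows = Path(path).read_text(encoding="utf-8").splitlines()[1:]  # skip header
        for n, _row in enumerate(rows, start=1):
            # Real per-row work would go here; progress is saved as we go so a
            # restarted job can tell how far the previous attempt got.
            db.execute("UPDATE upload_state SET rows_done = ? WHERE id = ?", (n, upload_id))
        db.execute("UPDATE upload_state SET status = 'complete' WHERE id = ?", (upload_id,))
    except Exception as exc:
        db.execute(
            "UPDATE upload_state SET status = 'error', error = ? WHERE id = ?",
            (str(exc), upload_id),
        )
    finally:
        Path(path).unlink(missing_ok=True)  # delete the file when done with it
        db.commit()


def get_status(upload_id: int) -> dict:
    """What a polling endpoint would return for the key handed out earlier."""
    status, rows_done, error = db.execute(
        "SELECT status, rows_done, error FROM upload_state WHERE id = ?", (upload_id,)
    ).fetchone()
    return {"status": status, "rows_done": rows_done, "error": error}
```

In your stack, `accept_upload` roughly corresponds to the controller action, `jobs.put` to a Hangfire `BackgroundJob.Enqueue` call that passes only the primary key, and `get_status` to whatever endpoint your polling UI hits.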
I don’t know of any example code you could use for the whole solution, but I would start with getting your web server to save the file and the state data. Once you’ve got a file sitting there that you can process, plus some state data, I think the challenge of queuing up a Background Job and using the state data to determine your processing logic will seem a lot more surmountable (and maybe similar to mainframe logic).
Hope this helps.