File Upload Security and Malware Protection

May 28, 2023

Today we’re wrapping up this series on file uploads for the web. If you’ve been following along, you should now be familiar with enabling file uploads on the front end and the back end. We’ve covered architectural decisions to reduce the cost of hosting our files and improve delivery performance. So I thought we would wrap up the series today by covering security as it relates to file uploads.

If you’d like to go back and revisit any earlier posts in the series, they cover front-end uploads, back-end handling, storage costs, and delivery performance.

Introduction

Anytime I discuss the topic of security, I like to consult the experts at OWASP.org. Conveniently, they have a File Upload Cheat Sheet, which outlines several attack vectors related to file uploads and steps to mitigate them.

Today we’ll walk through this cheat sheet and see how to implement some of its recommendations in an existing application.

For a bit of background, the application has a frontend with a form that has a single file input, which uploads that file to a backend.

The backend is powered by Nuxt.js’ Event Handler API, which receives an incoming request as an “event” object, detects whether it’s a multipart/form-data request (always true for file uploads), and passes the underlying Node.js request object (or IncomingMessage) to a custom function called parseMultipartNodeRequest.

import formidable from 'formidable';

/* global defineEventHandler, getRequestHeaders, readBody */

/**
 * @see https://nuxt.com/docs/guide/concepts/server-engine
 * @see https://github.com/unjs/h3
 */
export default defineEventHandler(async (event) => {
  let body;
  const headers = getRequestHeaders(event);

  if (headers['content-type']?.includes('multipart/form-data')) {
    body = await parseMultipartNodeRequest(event.node.req);
  } else {
    body = await readBody(event);
  }
  console.log(body);

  return { ok: true };
});

All the code we’ll be focusing on today will live inside this parseMultipartNodeRequest function. And since it works with the Node.js primitives, everything we do should work in any Node environment, regardless of whether you’re using Nuxt or Next or any other sort of framework or library.

Inside parseMultipartNodeRequest we:

Create a new Promise
Instantiate a multipart/form-data parser using a library called formidable
Parse the incoming Node request object
The parser writes files to their storage location
The parser provides information about the fields and the files in the request

Once it’s done parsing, we resolve parseMultipartNodeRequest’s Promise with the fields and files.

/**
 * @param {import('http').IncomingMessage} req
 */
function parseMultipartNodeRequest(req) {
  return new Promise((resolve, reject) => {
    const form = formidable({
      multiples: true,
    });
    form.parse(req, (error, fields, files) => {
      if (error) {
        reject(error);
        return;
      }
      resolve({ ...fields, ...files });
    });
  });
}

That’s what we’re starting with today, but if you want a better understanding of the low-level concepts for handling multipart/form-data requests in Node, check out “Handling File Uploads on the Backend in Node.js (& Nuxt).” It covers low-level topics like chunks, streams, and buffers, then shows how to use a library instead of writing one from scratch.

Securing Uploads

With our app set up and running, we can start to implement some of the recommendations from OWASP’s cheat sheet.

Extension Validation

With this technique, we check the file name extension of each upload and only let files with an allowed extension into our system.

Fortunately, this is pretty easy to implement with formidable. When we initialize the library, we can pass a filter configuration option, which should be a function that has access to a file object parameter that provides some details about the file, including the original file name. The function must return a boolean that tells formidable whether or not to write the file to the storage location.

const form = formidable({
  // other config options
  filter(file) {
    // filter logic here
  }
});

We could check file.originalFilename against a regular expression that tests whether a string ends with one of the allowed file extensions. For any upload that doesn’t pass the test, we can return false to tell formidable to skip that file, and for everything else, we can return true to tell formidable to write the file to the system.

const form = formidable({
  // other config options
  filter(file) {
    const originalFilename = file.originalFilename ?? '';
    // Enforce file ends with allowed extension
    const allowedExtensions = /\.(jpe?g|png|gif|avif|webp|svg|txt)$/i;
    if (!allowedExtensions.test(originalFilename)) {
      return false;
    }
    return true;
  }
});
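
To see how the pattern behaves on its own, here is a small standalone sketch that runs the same check against a few sample names. The hasAllowedExtension helper and the sample file names are made up for illustration; note the escaped dot, so only a literal “.” before the extension counts.

```javascript
// Hypothetical helper (not part of formidable) wrapping the allow-list check.
const ALLOWED_EXTENSIONS = /\.(jpe?g|png|gif|avif|webp|svg|txt)$/i;

function hasAllowedExtension(originalFilename) {
  // A missing name is treated as not allowed.
  return ALLOWED_EXTENSIONS.test(originalFilename ?? '');
}

// Made-up sample names covering the interesting cases.
const samples = ['photo.JPG', 'notes.txt', 'script.php', 'photo.jpg.exe', 'imagejpg'];
for (const name of samples) {
  console.log(name, hasAllowedExtension(name));
}
```

Only the first two names pass. Keep in mind that a renamed file with an allowed extension would pass too, so this check is just a first layer.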

Filename Sanitization

Filename sanitization is a good way to protect against file names that may be too long or include characters that aren’t acceptable for the operating system.

The recommendation is to generate a new filename for every upload. Some options are a random string generator, a UUID, or some sort of hash.

Once again, formidable makes this easy for us by providing a filename configuration option. And once again it should be a function that provides details about the file, but this time it must return a string.

const form = formidable({
  // other config options
  filename(file) {
    // return some random string
  },
});

We can actually skip this step because formidable’s default behavior is to generate a random hash for every upload. So we’re already following best practices just by using the default settings.
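
If you ever do need to generate names yourself (for example, to keep a safe file extension), a sketch might look like the following. The generateFilename helper and its allow-list are hypothetical; it relies on Node’s built-in crypto.randomUUID and path.extname. How the result gets wired into formidable’s filename option depends on the version you use, so check its documentation for the exact arguments.

```javascript
// CommonJS require keeps this runnable as a plain script; use `import` in an ESM project.
const { randomUUID } = require('node:crypto');
const path = require('node:path');

// Hypothetical allow-list of extensions that are safe to keep.
const SAFE_EXTENSIONS = new Set(['.jpg', '.jpeg', '.png', '.gif', '.webp']);

function generateFilename(originalFilename) {
  // Only the extension of the user-supplied name is ever looked at.
  const ext = path.extname(originalFilename ?? '').toLowerCase();
  const safeExt = SAFE_EXTENSIONS.has(ext) ? ext : '';
  return `${randomUUID()}${safeExt}`;
}

console.log(generateFilename('Holiday Photo.JPG')); // random UUID followed by .jpg
console.log(generateFilename('../../etc/passwd'));  // random UUID, no extension
```

Because nothing from the original name except an allow-listed extension survives, path traversal sequences and odd characters never reach the file system.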

Upload and Download Limits

Next, we’ll tackle upload limits. This protects our application from running out of storage, limits how much we pay for storage, and limits how much data could be transferred if those files get downloaded, which may also affect how much we have to pay.

Once again, we get some basic protection just by using formidable because it sets a default value of 200 megabytes as the maximum file upload size.

If we want, we could override that value with a custom maxFileSize configuration option. For example, we could set it to 10 megabytes like this:

const form = formidable({
  // other config options
  maxFileSize: 1024 * 1024 * 10,
});

The right value to choose is highly subjective and depends on your application’s needs. For example, an application that accepts high-definition video files will need a much higher limit than one that expects only PDFs.

You’ll want to choose the lowest value you can without it being so low that it hinders normal users.
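
As a rough sketch, the limits could live in one place as a plain options object. maxFileSize is the option discussed above; maxFiles and maxTotalFileSize are options that appear in recent formidable documentation, so treat them as assumptions and confirm them for your version. All the numbers are placeholders.

```javascript
const MB = 1024 * 1024;

// Placeholder limits for an app that only accepts a few images per request.
const uploadLimits = {
  maxFileSize: 10 * MB,      // per-file cap: 10485760 bytes
  maxFiles: 5,               // assumption: supported by your formidable version
  maxTotalFileSize: 25 * MB, // assumption: cap on the combined size of all files
};

// Usage (not run here): formidable({ ...uploadLimits })
console.log(uploadLimits);
```

Keeping the numbers in one object makes it easier to reuse the same limits in client-side validation messages.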

File Storage Location

It’s important to be intentional about where uploaded files get stored. The top recommendation is to store uploaded files in a completely different location than where your application server is running.

That way, if malware does get into the system, it will still be quarantined without access to the running application. This can prevent access to sensitive user information, environment variables, and more.

In one of my previous posts, “Stream File Uploads to S3 Object Storage and Reduce Costs,” I showed how to stream file uploads to an object storage provider. So it’s not only more cost-effective, but it’s also more secure.

But if storing files on a different host isn’t an option, the next best thing we can do is make sure that uploaded files don’t end up in the root folder on the application server.

Again, formidable handles this by default. It stores any uploaded files in the operating system’s temp folder. That’s good for security, but if you want to access those files later on, the temp folder is probably not the best place to store them.

Fortunately, there’s another formidable configuration setting called uploadDir to explicitly set the upload location. It can be either a relative path or an absolute path.

So, for example, I may want to store files in a folder called “uploads” inside my project folder. This folder must already exist, and if I want to use a relative path, it must be relative to the application runtime (usually the project root). That being the case, I can set the config option like this:

const form = formidable({
  // other config options
  uploadDir: './uploads',
});
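
Since the folder has to exist before formidable writes to it, one option is to create it at startup and hand formidable an absolute path. This is a minimal sketch using Node’s built-in fs and path modules; ensureUploadDir is a hypothetical helper, not part of formidable.

```javascript
// CommonJS require keeps this runnable as a plain script; use `import` in an ESM project.
const fs = require('node:fs');
const path = require('node:path');

function ensureUploadDir(dir) {
  // Resolve relative paths against the current working directory.
  const absolute = path.resolve(dir);
  // recursive: true creates missing parents and does not throw if the folder exists.
  fs.mkdirSync(absolute, { recursive: true });
  return absolute;
}

const uploadDir = ensureUploadDir('./uploads');
console.log(uploadDir);
// Usage (not run here): formidable({ uploadDir })
```

Resolving to an absolute path up front also removes any ambiguity about which directory the relative path is relative to.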

Content-Type Validation

Content-Type validation is important to ensure that the uploaded files match a given list of allowed MIME-types. It’s similar to extension validation, but it’s important to also check a file’s MIME-type because it’s easy for an attacker to simply rename a file to include a file extension that’s in our allowed list.

Looking back at formidable’s filter function, we’ll see that it also provides us with the file’s MIME-type. So we could add some logic that enforces that the file’s MIME-type matches our allow list.

We could modify our previous function to also filter out any upload that isn’t an image.

const form = formidable({
  // other config options
  filter(file) {
    const originalFilename = file.originalFilename ?? '';
    // Enforce file ends with allowed extension
    const allowedExtensions = /\.(jpe?g|png|gif|avif|webp|svg|txt)$/i;
    if (!allowedExtensions.test(originalFilename)) {
      return false;
    }
    const mimetype = file.mimetype ?? '';
    // Enforce file uses allowed mimetype
    return Boolean(mimetype && (mimetype.includes('image')));
  }
});

Now, this would be great in theory, but the reality is that formidable actually generates the file’s MIME-type information based on the file extension.

That makes it no more useful than our extension validation. It’s unfortunate, but it also makes sense and is likely to remain the case.

formidable’s filter function is designed to prevent files from being written to disk. It runs as it’s parsing uploads. But the only reliable way to know a file’s MIME-type is by checking the file’s contents. And you can only do that after the file has already been written to disk.

So we technically haven’t solved this issue yet, but checking file contents actually brings us to the next issue, file content validation.
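
As an illustration of checking contents rather than names, the sketch below compares a file’s first bytes against a few well-known image signatures (often called “magic bytes”). The detectImageType helper is hypothetical, covers only a handful of formats, and is no replacement for a malware scan. It would run on the first bytes of a file after it has been written.

```javascript
// Well-known leading byte signatures for a few image formats.
const SIGNATURES = [
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
];

/**
 * @param {Uint8Array} buffer The first bytes of the uploaded file
 * @returns {string | null} A MIME type, or null when nothing matches
 */
function detectImageType(buffer) {
  for (const { mime, bytes } of SIGNATURES) {
    if (buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b)) {
      return mime;
    }
  }
  return null;
}

// A text file renamed to .jpeg still starts with plain text, not an image signature.
console.log(detectImageType(Buffer.from('just some text')));        // prints null
console.log(detectImageType(Buffer.from([0xff, 0xd8, 0xff, 0xe0]))); // prints image/jpeg
```

A check like this catches simple renames, but a file can carry a valid image header and still be harmful, which is why content scanning is still needed.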

Intermission

Before we get into that, let’s check the current functionality. I can upload several files, including three JPEGs and one text file (note that one of the JPEGs is quite large).

When I upload this list of files, I’ll get a failed request with a status code of 500. The server console reports that the error is because the maximum allowed file size was exceeded.

Server console reporting the error, "[nuxt] [request error] [unhandled] [500] options.maxFileSize (10485760 bytes) exceeded, received 10490143 bytes of file data"

This is good.

We’ve prevented a file that exceeds the maximum file size limit from being uploaded into our system (we should probably do a better job of handling errors on the backend, but that’s a job for another day).
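
As a starting point for that error handling, the sketch below maps a parser error to an HTTP status instead of letting everything surface as a 500. It assumes the error object carries a numeric httpCode property, which formidable’s errors appear to do in recent versions; verify this for yours. statusFromUploadError is a hypothetical helper.

```javascript
function statusFromUploadError(error) {
  const code = Number(error?.httpCode);
  // Only trust values that are valid HTTP error statuses.
  if (Number.isInteger(code) && code >= 400 && code <= 599) {
    return code;
  }
  // Anything else is treated as an unexpected server error.
  return 500;
}

// Simulated errors; a real one would come from form.parse().
const tooLarge = Object.assign(new Error('options.maxFileSize exceeded'), { httpCode: 413 });
console.log(statusFromUploadError(tooLarge));          // 413
console.log(statusFromUploadError(new Error('boom'))); // 500
```

In a Nuxt handler, that status could then be passed to something like h3’s createError({ statusCode, statusMessage }), so an oversized upload is reported as a client error rather than a server failure.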

Now, what happens when we upload all those files except the big one?

No error.

And looking in the “uploads” folder, we’ll see that despite uploading three files, only two were saved. The .txt file didn’t get past our filter.

We’ll also notice that the names of the two saved files are random hash values. Once again, that’s thanks to formidable’s default behavior.

Now there’s just one problem. One of those two successful uploads came from the “bad-dog.jpeg” file I selected. That file was actually a copy of “bad-dog.txt” that I renamed. And THAT file actually contains malware 😱😱😱

We can prove it by running one of the most popular Linux antivirus tools on the uploads folder, ClamScan. Yes, ClamScan is a real thing. Yes, that’s its real name. No, I don’t know why they called it that. Yes, I know what it sounds like.

(Side note: The file I used was created for testing malware software. So it’s harmless, but it’s designed to trigger malware scanners. But that meant I had to get around browser warnings, virus scanner warnings, firewall blockers, AND angry emails from our IT department just to get a copy. So you better learn something.)

OK, now let’s talk about file content validation.

File Content Validation

File content validation is a fancy way of saying, “scan the file for malware”, and it’s one of the more important security steps you can take when accepting file uploads.

We used ClamScan above, so now you might be thinking, “Aha, why don’t I just scan the files as formidable parses them?”

Similar to MIME-type checking, malware scanning can only happen after the file has already been written to disk. Additionally, scanning file contents can take a long time. Far longer than is acceptable in a request-response cycle. You wouldn’t want to keep the user waiting that long.

So we have two potential problems:

By the time we can start scanning a file for malware, it’s already on our server.
We can’t wait for scans to finish before responding to users’ upload requests.

Bummer…

Malware Scanning Architecture

Running a malware scan on every single upload request is probably not an option, but there are solutions. Remember that the goal is to protect our application from malicious uploads as well as to protect our users from malicious downloads.

Instead of scanning uploads during the request-response cycle, we could accept all uploaded files, store them in a safe location, and add a record in a database containing the file’s metadata, storage location, and a flag to track whether the file has been scanned.

Next, we could schedule a background process that looks up the database records for unscanned files and scans those files. If it finds any malware, it could remove it, quarantine it, and/or notify us. For all the clean files, it can update their respective database records to mark them as scanned.
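
The flow above can be sketched with an in-memory array standing in for the database and an injected function standing in for a real scanner such as ClamScan. Everything here (the record shape, scanPendingUploads, the fake scanner) is hypothetical and only meant to show the state transitions.

```javascript
// Records as they might be stored after an upload; `status` is the scan flag.
const uploads = [
  { id: 1, path: 'uploads/a1b2', status: 'pending' },
  { id: 2, path: 'uploads/c3d4', status: 'pending' },
  { id: 3, path: 'uploads/e5f6', status: 'clean' },
];

/**
 * @param {Array<{id: number, path: string, status: string}>} records
 * @param {(path: string) => Promise<boolean>} isInfected Injected scanner
 */
async function scanPendingUploads(records, isInfected) {
  for (const record of records) {
    // Skip anything that has already been scanned.
    if (record.status !== 'pending') continue;
    try {
      record.status = (await isInfected(record.path)) ? 'quarantined' : 'clean';
    } catch {
      // Leave the record pending so the next run retries it.
    }
  }
  return records;
}

// Fake scanner for demonstration: flags one specific path.
const fakeScanner = async (path) => path === 'uploads/c3d4';

scanPendingUploads(uploads, fakeScanner).then((result) => {
  console.log(result.map((r) => `${r.id}:${r.status}`).join(', '));
  // prints 1:clean, 2:quarantined, 3:clean
});
```

Injecting the scanner keeps the scheduling logic independent of any particular antivirus tool, and a failed scan simply leaves the file pending rather than marking it clean.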

Finally, there are considerations to make for the front end. We’ll likely want to show any previously uploaded files, but we have to be careful about providing access to potentially dangerous ones. Here are a couple of different options:

After an upload, only show the file information to the user who uploaded it, letting them know that it won’t be available to others until after it’s been scanned. You may even email them when it’s complete.
After an upload, show the file to every user, but don’t provide a way to download the file until after it has been scanned. Include some messaging to tell users the file is pending a scan, but they can still see the file’s metadata.

Which option is right for you really depends on your application’s use case. And of course, these examples assume your application already has a database and the ability to schedule background tasks.
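
Either option boils down to a small access check on the server. Here is a sketch of the first option combined with gated downloads, assuming each record stores the uploader’s ID and a scan status. The field names and the getFileAccess helper are hypothetical.

```javascript
/**
 * Decide what a user may do with an uploaded file.
 * @param {{ ownerId: string, status: 'pending' | 'clean' | 'quarantined' }} file
 * @param {{ id: string }} user
 */
function getFileAccess(file, user) {
  const isOwner = file.ownerId === user.id;
  return {
    // Unscanned files are visible only to the person who uploaded them.
    canView: file.status === 'clean' || (isOwner && file.status === 'pending'),
    // Downloads are only offered once a scan has marked the file clean.
    canDownload: file.status === 'clean',
  };
}

const pendingFile = { ownerId: 'u1', status: 'pending' };
console.log(getFileAccess(pendingFile, { id: 'u1' })); // { canView: true, canDownload: false }
console.log(getFileAccess(pendingFile, { id: 'u2' })); // { canView: false, canDownload: false }
```

For the second option, canView would simply be true for pending files regardless of who is asking, while canDownload stays tied to a clean scan.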

It’s also worth mentioning here that one of the OWASP recommendations is to limit file upload capabilities to authenticated users. This makes it easier to track and prevent abuse.

Unfortunately, databases, user accounts, and background tasks all require more time than I have to cover in today’s article, but I hope these concepts give you more ideas on how you can improve your upload security methods.

Block Malware at the Edge

Before we finish up today, there’s one more thing that I want to mention. If you’re an Akamai customer, you also have access to a malware protection feature as part of the web application firewall products. I got to play around with it briefly and want to show it off because it’s super cool.

I have an application up and running at uploader.austingil.com. It’s already integrated with Akamai’s Ion CDN, so it was easy to also set it up with a security configuration that includes IP/Geo Firewall, Denial of Service protection, WAF, and Malware Protection.

I configured the Malware Protection policy to just deny any request containing malware or a content type mismatch.

Now, if I go to my application and try to upload a file that has known malware, I’ll see almost immediately that the request is rejected with a 403 status code.

To be clear, that’s logic I didn’t actually write into my application. That’s happening thanks to Akamai’s malware protection, and I really like this product for a number of reasons.

It’s convenient and easy to set up and modify from within the Akamai UI.
I love that I don’t have to modify my application to integrate the product.
It does its job well and I don’t have to manage maintenance on it.
Last, but not least, the files are scanned on Akamai’s edge servers, which means it’s not only faster, but it also keeps blocked malware from ever even reaching my servers. This is probably my favorite feature.

Due to time and resource restrictions, I think Malware Protection can only scan files up to a certain size, so it won’t work for everything, but it’s a great addition for blocking some files from even getting into your system.

Closing Thoughts

It’s important to remember that there is no one-and-done solution when it comes to security. Each of the steps we covered has its own pros and cons, and it’s generally a good idea to add multiple layers of security to your application.

Okay, that’s going to wrap up this series on file uploads for the web. If you haven’t yet, consider reading some of the other articles.

Please let me know if you found this helpful, or if you have ideas for other series you’d like me to cover. I’d love to hear from you.

Thank you so much for reading. If you liked this article and want to support me, the best ways to do so are to share it, sign up for my newsletter, and follow me on Twitter.

Initially revealed on austingil.com.
