Scaling Pixel Art in Video Games

A common problem in video games is how to scale up pixel art to adapt it to the video display in a non-destructive manner. Here are a couple of interesting articles on how to achieve this via a dedicated shader.

Both articles approach the subject in depth, explaining the problem and showing the pros and cons of different approaches, and finally propose their solutions via shaders. Code samples and theory included.

Any further links to similar articles and code examples are appreciated — especially links providing solutions for specific engines and graphics libraries.


I haven’t encountered any problems while importing and using pixel art in Unity.

The only two things I do after importing a sprite are to set the Filtering Mode to Point (No Filtering) and to set the Compression to None.

Unity is a solid game engine, and from what I read it’s known to work well with pixel art. Other engines might pose some challenges though, for they try to scale the game canvas to occupy the full screen, often using algorithms which don’t work well with pixel art.

What I find interesting in the above articles is the use of shaders to handle upscaling, which is a creative approach that should work well on most shader-supporting engines. The screenshots in the articles show big differences in quality (the articles target the SDL2 library, which is used by various popular game engines).
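For anyone landing here from a search, the core trick in those articles is to keep bilinear filtering enabled but confine the blending to a band about one screen pixel wide around each texel seam. Here is a minimal sketch of that idea as a GLSL fragment shader (my own reconstruction of the general technique, not the articles' exact code; the uniform names are made up):

```glsl
#version 330 core

// Pixel-art upscaling with anti-aliased texel edges.
// The source texture must have bilinear filtering ENABLED;
// the shader snaps UVs so blending only occurs near texel seams.

uniform sampler2D u_texture;  // pixel-art source
uniform vec2      u_texSize;  // texture size in texels, e.g. vec2(320.0, 240.0)

in  vec2 v_uv;                // plain 0..1 UVs from the vertex stage
out vec4 fragColor;

void main() {
    vec2 pixel = v_uv * u_texSize;      // position in texel space
    vec2 seam  = floor(pixel + 0.5);    // nearest texel boundary
    vec2 span  = fwidth(pixel);         // texel-space footprint of one screen pixel
    // Stay on the texel centre except within one screen pixel of a seam,
    // where the hardware bilinear filter is allowed to blend neighbours.
    pixel = seam + clamp((pixel - seam) / span, -0.5, 0.5);
    fragColor = texture(u_texture, pixel / u_texSize);
}
```

At integer scales this degenerates to plain nearest-neighbour; at fractional scales it avoids the shimmering and uneven pixel widths you get from pure point sampling.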

tajmone, your technique works well for scaling up graphics, but does it work for scaling them down?

@Morintari:

your technique works well for scaling up graphics, but does it work for scaling them down?

The technique proposed in those articles (authored by third parties) is only for scaling up. I had never considered scaling down pixel art before — although it might occur in some contexts, like zooming out to offer a bird's-eye view of a large level. My guess is that the predefined scaling algorithms should be OK for scaling down, but I haven't really tried it.

Scaling down is always going to be a lossy operation though, for you'll have to discard actual pixels. Usually this is handled well by most algorithms (e.g. in Photoshop), unlike scaling up — again, taking Photoshop as an example, there are third-party plugins that handle scaling up much better than Photoshop's native algorithms.

If I come across any articles on the topic I'll add links, for completeness' sake.

Thank you tajmone for your quick response. You might ask why anyone would need lossless scaling when scaling down. You see, I do a lot of stereo graphics; the information must be the same on either side when scaled as it was at the larger size. Otherwise your eye cannot figure out what it is looking at and it's just a mess. If there were a way to scale down so that, as the image loses information, it loses the same information on either side, it would help remarkably. If you find anything that can do this, please don't hesitate to post… Thanks in advance.

@Morintari,

You see, I do a lot of stereo graphics; the information must be the same on either side when scaled as it was at the larger size.

I hadn't thought of stereo drawing — although I've been a stereoscopy fan since childhood, when I started drawing anaglyph vignettes with red and green pencils to view through colored lenses. I've always collected View-Master 3D disks, cold-war-era stereoscopic cameras (the FED Stereo), and all sorts of gadgets, and I own a 3D stereo TV (passive/polarized) and a VR headset.

Yes, this is a critical issue then, for I've experienced scaling-down problems when creating anaglyphs in Photoshop, so my guess is that the same applies to other stereo technologies too (passive polarized lenses or active shutter glasses).

Whether you're using anaglyphs or full-color images could make a big difference here, for in the former case colors should be preserved by the algorithm, while in the latter noise consistency might be more important.

Unfortunately I haven't had a chance to experiment with PM NG's stereo functionality, because my 3D TV is too far away from the PC and I don't know how to connect to it — I guess I'd need a laptop with a graphics card capable of full-HD output, which I don't have.

I think that shaders would offer the best solution here, for you'd be able to apply different filtering to the two image halves (usually they are split vertically for anaglyphs, but 3D TVs also allow horizontal splitting). I'm not sure how game engines support stereoscopy, but usually they only offer full color, assuming that the game will be played on a passive TV or with active shutter glasses (e.g. NVIDIA cards) — I've never tried shutter glasses, because I'm at risk of epileptic seizures, so I've passed on those.
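To illustrate what I mean (just a sketch, assuming the two views are packed side by side in one texture; uniform names are invented), a fragment shader can tell which eye it's shading from the UV and then process each half independently — or, crucially, identically:

```glsl
#version 330 core

// Per-eye processing of a side-by-side stereo frame.
// Left view occupies x in [0, 0.5), right view x in [0.5, 1.0].

uniform sampler2D u_frame;

in  vec2 v_uv;
out vec4 fragColor;

void main() {
    bool leftEye = v_uv.x < 0.5;
    // Remap this half back to the full 0..1 range, so the same
    // filtering code sees identical coordinates for both eyes.
    vec2 eyeUV = vec2(leftEye ? v_uv.x * 2.0 : (v_uv.x - 0.5) * 2.0, v_uv.y);

    vec4 col = texture(u_frame, v_uv);
    // Example per-eye operation: a subtle vignette computed in
    // eye-local coordinates, hence identical for both views.
    float vig = 1.0 - 0.15 * length(eyeUV - 0.5);
    fragColor = vec4(col.rgb * vig, col.a);
}
```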

On consoles there are few stereo games (the Crysis series, Sniper Elite, and a few others) and all of them rely on 3D TVs — but there have also been a couple of games that supported anaglyph rendering (I can't recall the titles).

It's been a long time since I visited the International Stereoscopic Union website, but I remember that they have a lot of links to developer tools for manipulating 3D images of all types, as well as articles and algorithms. I had downloaded some free tools from them for manipulating anaglyphs, stereograms, and other types of 3D images. So it might be worth digging into their website:

http://www.isu3d.org/

I'll keep an eye open on this issue, for it interests me too. My bet is that if you don't find a solution or article on the ISU website then you won't find it anywhere else. But, as I said, it's been a long time since I last visited it.

Another consideration that comes to mind regarding scalable 3D-stereo pixel art is noise filtering. My experience with digital images in stereoscopy is that you need to apply lots of noise filtering to enhance the 3D space, and avoid flat areas at all costs. I used to achieve this with either duotone patterns or noise generators.

Also, texturing is used to create the illusion of depth in 2D art as well, with grainy patterns seeming further away than plain surfaces. You can exploit this design principle to enhance the 3D effect (together with other principles of color and perceptual illusion).

Now, if you're working on pixel-art games that need to scale up and down during play (or adapt in size to the display), then it would probably be better to avoid noise in the original pixel art and apply the noise by means of shaders — doing so, you'll obtain consistent noise on both images regardless of scaling (for you apply it after scaling, as a last pass), and you can also apply identical noise over both halves of the split images.
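A sketch of such a last-pass noise shader (the hash function and uniform names are my own, not from the articles): the right half's UVs are folded onto the left half before sampling the noise, so both eyes receive exactly the same grain no matter how the frame was scaled.

```glsl
#version 330 core

// Post-scale noise pass over a side-by-side stereo frame.
// Applied AFTER upscaling, so the grain is resolution-independent.

uniform sampler2D u_scaledFrame;  // the already-scaled game image
uniform float     u_grain;        // noise strength, e.g. 0.04

in  vec2 v_uv;
out vec4 fragColor;

// Cheap deterministic hash noise; any PRNG works, what matters is
// that both halves are fed identical coordinates (and thus seeds).
float hash(vec2 p) {
    return fract(sin(dot(p, vec2(127.1, 311.7))) * 43758.5453);
}

void main() {
    // Fold the right half onto the left: both eyes now share one
    // noise domain.
    vec2  eyeUV = vec2(v_uv.x < 0.5 ? v_uv.x : v_uv.x - 0.5, v_uv.y);
    float n     = hash(floor(eyeUV * 512.0)) - 0.5;  // per-cell grain

    vec4 col = texture(u_scaledFrame, v_uv);
    fragColor = vec4(col.rgb + n * u_grain, col.a);
}
```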

The only problem with this solution is that you'll have a uniform noise pattern (which might weaken the illusion of depth). A workaround would be to map certain colors to specific noise patterns, so the shader can selectively manipulate the pixel-art components (sprites, tiles, etc.). You could always adjust the original colors via the shader too.

Some engines (e.g. GameMaker Studio) allow applying shaders on a per-sprite basis, which would then allow different noise patterns to be overlaid depending on the sprite's Z position (distant sprites grainier, near ones less so). But this might be overkill for the GPU if you have lots of sprites, and might not work on mobile devices (or WebGL, for that matter).
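GameMaker shaders are written in GLSL ES, so a per-sprite version could look roughly like this. Note that `u_depth` is a hypothetical uniform you'd set from the sprite's Z before each draw call (via `shader_get_uniform()` and `shader_set_uniform_f()`); only `gm_BaseTexture` and the two varyings are GameMaker built-ins.

```glsl
// GLSL ES fragment shader (GameMaker style): grain amount scales
// with a per-sprite depth uniform, so distant sprites get grainier.

varying vec2 v_vTexcoord;   // GameMaker's default varyings
varying vec4 v_vColour;

uniform float u_depth;      // hypothetical: 0.0 = nearest, 1.0 = farthest

float hash(vec2 p) {
    return fract(sin(dot(p, vec2(127.1, 311.7))) * 43758.5453);
}

void main() {
    vec4  col = v_vColour * texture2D(gm_BaseTexture, v_vTexcoord);
    float n   = (hash(floor(v_vTexcoord * 256.0)) - 0.5) * 0.1 * u_depth;
    gl_FragColor = vec4(col.rgb + n, col.a);
}
```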

Whatever the solution (tests need to be done), the sure fact in all this is that shaders can guarantee that each half of the image is processed as a mirror of the other half, which is the key factor in preserving the illusion of depth in both eyes. My bet is that even if the scaling-down algorithm doesn't preserve pixels consistently across the two halves, a consistent noise overlay might recover the effect by compensating — just think of those stereo images where there's also a non-stereo color image with the blue/red anaglyph images superimposed; the anaglyphs add a layer of depth to the color image (usually shot through a third lens, between the right/left lenses).

I was right about the noise filters! I've found this tutorial in the PFTrack user guide (software for 3D digital video processing):

Although this software operates on a whole other level, the principles are the same: applying filters and noise is the key to preserving the stereo 3D illusion in both eyes. What the linked article says about measuring the depth of 3D objects (the software tracks it from motion) can also be applied to sprites, if you know their Z coordinate and are able to pass it to the shader.

tajmone, thank you again for your quick reply. Although I am an artist and not a programmer, I do appreciate your research into noise filtering. Unfortunately much of what you said just went over my head.

Since you were interested in the View-Master, you may be interested in the Virtual Boy (vr32.de, Planet Virtual Boy), Nintendo's first 3D game system. Many people have criticisms of it, but to me it's like a 3D laser show. Although the 3-D tools will support any 256-color 3-D pixel-art application, they were created for the Virtual Boy. I should know: I created them, Jan programmed them, and my friend RunnerPack helped with the specifics. IMHO they are the best for any pixel-art application.

Let me know when you get your 3-D monitor hooked up and you are drawing in 3-D, so I can count you on the list of artists that draw with PMNG in 3-D. The list is growing, and the more users we have the more tools we get. I'm sorry that you cannot use them at this time, and I hope that I'm not rubbing it in. I'm sure you are aware of the cross-eye method, and I assume it doesn't work for you. Although I am not a programmer, I do have programmer friends, and I am sure they will explain your techniques to me in a way I can understand. Thanks again for your research.

tajmone, I may be missing something, but… I understand how you can make a depth map to turn a 2-D photo into stereo, or even shrink it. But is there a way to just shrink two stereo images and still have correct depth, without a depth map?

I had to follow your link and do some research, for I didn't know of this Nintendo gadget from the '90s — probably it never made it to the Italian market. It's difficult to imagine what the 3D experience could be like, since it uses a strange technology: a vibrating mirror that spreads a beam of "pixels" vertically across the eyes. Also, the single red color is difficult to imagine. I guess it must be somewhat close to the View-Master, but with motion.

My general experience with VR headsets is not positive though, and what I liked about the View-Master was exactly that its images were still. Even in VR headsets, I prefer games that don't move at all (e.g. pinball tables), because motion tends to "normalize away" the 3D effect after a while — the same goes for 3D movies using polarized lenses. The fact is that seeing in 3D is the normal condition of mankind, so when we watch a 3D animation we only get that "wow" experience for the first 10 minutes; then we focus on the story/game and the stereoscopy fades into the background, almost unnoticed.

This is why good 3D animated movies tend to alternate scenes with little movement and sequences with rapid movement and great depth (jumps from buildings, canyons, etc.), because it kind of reawakens the awareness that there is 3D depth in the movie.

Static images/games, on the other hand (like the View-Master), give you the chance to focus on the single scene and appreciate its three dimensions more (IMO).

As for the technical question: no, you don't need depth maps at all, just noise and grain that add depth. Even in PS4 VR games you'll always see that kind of noise — as if there were millions of dust particles in the air. You notice it especially in a black scene, because the noise looks like many little snowflakes, slightly illuminated. That noise is constantly added to the scene to guide 3D vision, because flat colors and surfaces tend to look like empty holes in stereoscopy.

I'll give you a link to a few examples — a 3D (anaglyph) website I created in the early '90s:

http://www.neurolinguistic.com/teknoboy/italy-01.htm

(You can also see my stereo camera, the "FED Stereo", Ukrainian edition, and a photograph I shot with it, rendered as an anaglyph via Photoshop — v4 probably!)

Bear in mind that these images are from the era of dial-up Internet connections, when graphics were kept very small due to slow connections, and displays were usually 800x600. In fact, this page (full 3D) needs to be scaled to 60% to render properly, because today our displays have double that resolution:

http://www.neurolinguistic.com/teknoboy/italy-02.htm

But even if you scale the page in the browser, the 3D effect is preserved — thanks to the noise filters I added to both color channels. I can't remember now whether I applied those noise filters on a per-layer basis or on the final merged layer. I think I used different techniques depending on the image at hand.

The same rules apply to the Nintendo Virtual Boy, no doubt. Wikipedia mentions that the Virtual Boy didn't use many monocular cues besides parallax, which could be one of the main reasons it didn't take off well:

If you look at the Mario Tennis demo, it's basically very bright, thick lines on a black background, which is counterproductive. On the site you linked, modern games for this console tend to use more dithering techniques to achieve depth — i.e. noise. Dithering is noise, and duotone patterns (like in manga comics) were also used a lot in anaglyph comics to prevent uniform backgrounds from showing up as "empty space".

With the Virtual Boy the big limit is the reduced color set — three shades of red plus black. So dithering is a necessity:

But dithering is also good because it preserves the 3D depth illusion. I think that at a resolution like this (384x224) it might be difficult to handle noise with just three shades — I mean algorithmically — so you'll have to work on the dithering manually, which doesn't solve the problem of scaling images down, though.
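For what it's worth, ordered dithering is easy to do algorithmically in a shader too. Here is a sketch quantizing luminance to four red levels with a 4x4 Bayer matrix (desktop GLSL, invented uniform names), though at such a low resolution hand-placed dithering will probably still look better:

```glsl
#version 330 core

// Ordered (Bayer) dithering down to a Virtual-Boy-like palette:
// black plus three shades of red.

uniform sampler2D u_scene;   // full-color input

in  vec2 v_uv;
out vec4 fragColor;

// 4x4 Bayer matrix (values 0..15)
const float bayer[16] = float[16](
     0.0,  8.0,  2.0, 10.0,
    12.0,  4.0, 14.0,  6.0,
     3.0, 11.0,  1.0,  9.0,
    15.0,  7.0, 13.0,  5.0);

void main() {
    float lum = dot(texture(u_scene, v_uv).rgb, vec3(0.299, 0.587, 0.114));
    ivec2 p = ivec2(gl_FragCoord.xy) % 4;
    float threshold = (bayer[p.y * 4 + p.x] + 0.5) / 16.0;
    // Quantize to 4 levels, with the Bayer threshold deciding rounding.
    float level = floor(lum * 3.0 + threshold) / 3.0;
    fragColor = vec4(level, 0.0, 0.0, 1.0);  // red-only output
}
```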

You could try to overlay diagonal lines that cross the "screen" to create an interference pattern without being disruptive — a bit like those filters that emulate CRT monitors on modern computers by adding horizontal scan lines (again, done via shaders). Pixel-perfect diagonal lines that subtly affect the hues of the underlying pixels should create an effect that doesn't break the image but can still be perceived. They'd act like the single duotone patterns used in comics — e.g. see this 3D comic:

You can see the duotone patterns on the car seat, and the diagonal lines in the background. Without these reference patterns, the image would look flat with the glasses on.
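A sketch of that diagonal-line pass (my own invention along the lines described above, not taken from any article): every fourth diagonal slightly darkens the pixels it crosses, pixel-perfect because it works on integer fragment coordinates.

```glsl
#version 330 core

// Pixel-perfect diagonal-line overlay, in the spirit of CRT scan-line
// filters: the lines nudge the underlying pixels without hiding them.

uniform sampler2D u_frame;
uniform float     u_strength;   // line darkening, e.g. 0.08

in  vec2 v_uv;
out vec4 fragColor;

void main() {
    vec4  col  = texture(u_frame, v_uv);
    ivec2 p    = ivec2(gl_FragCoord.xy);
    // Fragments lying on every 4th diagonal (x + y multiple of 4)
    // are darkened slightly, forming a subtle 45-degree pattern.
    float line = ((p.x + p.y) % 4 == 0) ? 1.0 - u_strength : 1.0;
    fragColor  = vec4(col.rgb * line, col.a);
}
```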

I hope this might help.

As for the 3D TV, I don't think I'll ever be able to hook it up. The screen is too big, and I'd have to sit too far away to use it with a PC. The best I could do is put some images on a pen drive and view them with the TV's image apps. Regarding the cross-eye technique, I actually like it — both for stereograms and for stereo photos like these:

https://www.stereoscopy.com/isu/gallery.html

(I can see them perfectly without getting fatigued at all).

tajmone, the biggest advantage of the Virtual Boy isn't the slim collection of titles that was created for it. The biggest advantage is the games that can and will be produced for it, and the vast amount of tools we have for that purpose.