Skip to content

Use VolatileImage for back-buffers to allow hardware-accelerated rendering #155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

hfmkwi
Copy link

@hfmkwi hfmkwi commented May 6, 2025

Summary

This partially fixes Ticket #20366 wherein rendering is abysmally slow, especially on HiDPI systems. Current behavior uses a double-buffer strategy with BufferedImages, forcing virtually all rendering into software loops.1 The bespoke HiDPI buffer scaling with AffineTransforms compounds this performance degradation.2 Performance seems to suffer exponentially with each additional layer regardless of its data; imagery tile layers seem to contribute the most.3

I’ve switched the buffers to VolatileImages, so the renderer can now defer to hardware routines where possible. This also defers HiDPI-scaling to the hardware surface, eliminating the need for the existing software one. The performance improvement is most obvious when running JOSM with a hardware-accelerated pipeline and

  1. panning an non-trivial OSM data layer with labels disabled;
  2. panning an arbitrary number of enabled imagery layers;
  3. working with area fill enabled; or
  4. working at high zoom levels.

This cannot convert some inherently software-based render operations like

  • Graphics2D canvas/text anti-aliasing;
  • GlyphVector-based text rendering; and
  • alpha compositing (and whatever else I couldn’t find).

These ought to be refactored to take advantage of the equivalent hardware routines, if possible. If not, then the user should have finer control over these (e.g., rendering hints).

Miscellanea

This also fixes a couple of performance regressions:

  • MapView was marked as a transparent component; and
  • hovering over an OSM primitive would immediately invalidate the layer, even if inactive

Footnotes

  1. See Oracle docs.

  2. Incidentally, this is probably why the recommendation to run with -Dsun.java2d.uiScale=1 was a suggested solution, since the resulting AffineTransform would’ve been the identity matrix.

  3. Stacking imagery layers very quickly destroys the frame rate.

hfmkwi added 18 commits May 5, 2025 08:11
Assuming that JOSM draws *nothing else* underneath MapView, this
should be safe. Reasoning: the unchanged layers buffer looks to be
the back-back buffer that JOSM overlays the actual back buffer onto
before dumping all that onto MapView’s canvas.
Since we can create a BufferedImage without a valid GraphicsConfiguration (i.e.,
it is null), we can re-use the headless conditional to pass a dummy buffer
during component instantiation. That conditional also short-circuits the null
check.
Pre-commit, the map renderer uses transient BufferedImages, an image type that
notoriously executes drawing entirely in software loops (you may verify this
behavior by running with `-Dsun.java2d.trace=count` and either `-Dsun.java2d
.opengl=true` or `-Dsun.java2d.d3d=true`). Unfortunately, this means that for
the last decade or so, JOSM has had no support for hardware‑accelerated
rendering (see ticket #10101). Rendering performance is worsened by using said
BufferedImages for bespoke double-buffering instead of Swing’s native
double-buffering or BufferStrategy/VolatileImage. This chronic performance
degradation is unacceptable: a mid-range tower from the mid-2010s should not
struggle to render a **single layer of Bing tiles**.

VolatileImages do not have this problem as acceleratable–hardware surfaces.
Switching to these also obsolesces the UI-scaling affine transforms (presumably
because hardware surface takes UI scaling into account? it just works,
apparently 😕).
SVN 19176 introduced a performance regression in the OSM data layer where
hovering over a primitive triggers a layer invalidation and tile cache reset,
all presumably for the tiled renderer. When working with even modest datasets,
this triggers superfluous repaints of the MapView. The only active renderer
check is in `resetTiles` (SVN 19271)—and this fails to guard against any other
execution specific to the tiled renderer.
Previous behavior has the render code invalidate the back buffers on *all*
component resizes. We can avoid this by creating this buffer exactly once during
initialization with the maximum possible virtual desktop boundary. Render code
now only needs to clip the buffer accordingly to component bounds.

Also, paint code runs when headless (?). I’m uncertain of if this necessary; a
quick look at tests indicates no, but I’ve added some extra guards where
applicable.
@@ -101,6 +103,9 @@ public class MapView extends NavigatableComponent
implements PropertyChangeListener, PreferenceChangedListener,
LayerManager.LayerChangeListener, MainLayerManager.ActiveLayerChangeListener {

private static final boolean IS_HEADLESS = GraphicsEnvironment.isHeadless();
private static final Rectangle VIRTUAL_DESKTOP_BOUNDS = getVirtualDesktopBounds();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The virtual desktop is not static. It can change (i.e. add another monitor).

Copy link
Author

@hfmkwi hfmkwi May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an issue I thought about for a while, since there's no non-trivial way of solving it.1 There’s no public interface for listening to complete GraphicsEnvironment/GraphicsDevice changes.2

I could have the virtual desktop dimensions recalculated on
every GraphicsConfiguration change. This might work if

  • GraphicsEnvironment.getScreenDevices() always returns a live view of current screen devices; and
  • stale screen devices are immediately evicted alongside their GraphicsConfigurationss.

These were my initial assumption—but I can’t find any supporting documentation. In testing, the component GC is invalidated when it’s moved between screen devices. I think this is reasonable enough evidence for my assumption. For now, I can apply that solution, since it seems the best compromise.

Footnotes

  1. At least, there’s no tidy solution I’ve found yet myself.

  2. I could use reflection to get the internal one, but then I’d be dealing with implementation details that would unnecessarily complicate this. I also can’t reliably test for these, and I doubt these are guaranteed compatibility with an arbitrary Java version.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assumptions sound sane. Go for it and we'll hope it works fine. While I appreciate the improvement I fear the issues which changes to such a central component can bring. Especially the multiscreen stuff was very fragile a long time.

Copy link
Author

@hfmkwi hfmkwi May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the concerns: I’ve approaching this with multi-head setups in mind since I’ve been testing on one. There are some hardware pipeline idiosyncracies that might affect render speeds depending on which screen device the application resides,1 but those seem platform- and pipeline-specific.2 Generally, this setup should be correct for both single- and multi-head setups and for HiDPI setups.3

Footnotes

  1. On my machine, at least, the D3D pipeline runs on all screens and VolatileImages are compatible only with their progenitor screen device, whereas the OGL one runs on display 0 only and the VIs appear to be resistant to that.

  2. Determining the actual renderer requires either a string comparison with screen device IDs or reflection to expose package-privage implementation details of the JDK

  3. Internally, SunVolatileImage appears to use a GC-compatible BufferedImage if something goes wrong with the hardware pipeline, falling back to the same setup as current. Also, this means null-checking the return value of createCompatibleVolatileImage is optional except in headless setups, but then MapView will skip creating those buffers and painting entirely.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VIRTUAL_DESKTOP_BOUNDS indicates it is static and unchangeable (constant). I think virtualDesktopBounds is better.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One checkstyle:
src/org/openstreetmap/josm/gui/MapView.java:369:43: Missing space after ','

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I did a first test on my non-HIDPI screen using a larger data set and nothing else and simply move it around. With your patch it feels significantly slower.

Copy link
Author

@hfmkwi hfmkwi May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I did a first test on my non-HIDPI screen using a larger data set and nothing else and simply move it around. With your patch it feels significantly slower.

What render settings were your running with? Pipeline, anti-aliasing? Any sort of AA, text or canvas, is done entirely in software, and the JDK docs caution against mixing hardware- and software-based rendering to avoid performance degradation.1 Maybe I should have the AA preferences in the GUI warn about this? As for the pipeline, assuming you’re on a Windows machine, using anything other than the DirectDraw/GDI one should significantly yield better frame times.2

Footnotes

  1. In testing, enabling AA doubled render times when on the hardware pipeline.

  2. Strangely, in current JOSM, disabling DD with -Dsun.java2d.nodraw=true yields perceptually better render times.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these changes are sufficiently intrusive, I could also move this code to an entirely separate renderer that overrides paint methods as applicable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used default settings (Linux openSUSE), no special setup.

I think the first thing we need is proper profiling code which also outputs the relevant parameters (pre-patch and post-patch), so that we can compare number instead of feelings. We can then use a screen recorder and playback to compare identical situations.

Than we can check where the code improves situation and where it makes it worse. Based on that we can either fix introduced bugs or decide to make different code for different setups.

@hfmkwi hfmkwi marked this pull request as draft May 8, 2025 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants