Assessing the Ability of Simulated Laboratory Scenes to Predict the Image Quality Performance of HDR Captures (and Rendering) of Exterior Scenes Using Mobile Phone Cameras

With the advent of computational photography, most cellphones include High Dynamic Range (HDR) modes or “apps” that capture and render high contrast scenes in-camera using techniques such as multiple exposures and subsequent “addition” of those exposures to render a properly exposed image. The results from different cameras vary. Testing the image quality of different cameras involves field-testing under dynamic lighting conditions that may involve moving objects. Such testing often becomes a cumbersome and time-consuming task. It would be more efficient to conduct such testing in a controlled, laboratory environment. This study investigates the feasibility of such testing. Natural exterior scenes, at day and night, some of which include “motion”, were captured with a range of cellphone cameras using their native HDR modes. The luminance ratios of these scenes were accurately measured using various spectro-radiometers and luminance meters. Artificial scenes, which include characteristics of the natural exterior scenes and have similar luminance ratios, were created in a laboratory environment. These simulated scenes were captured using the same modes as the natural exterior scenes. A subjective image quality evaluation was conducted using some 20 observers to establish an observer preference scale separately for each scene. For each natural exterior scene, the correlation coefficients between its preference scale and the preference scale obtained for each laboratory scene were calculated, and the laboratory scene with the highest correlation was identified. It was determined that while it was difficult to accurately quantify the actual dynamic range of a natural exterior scene, especially at night, we could still simulate the luminance ratios of a wide range of natural exterior HDR scenes, from 266:1 to 15120:1, within a laboratory environment. 
Preliminary results of the subjective study indicated that reasonably good correlation (0.8 or higher on average) was obtained between the natural exterior and laboratory simulated scenes. However, such correlations were determined to be specific to the type of scene studied. The scope of this study needs to be narrowed. Another consideration, how moving objects in the scene would affect the results, needs further investigation.


Introduction
Dynamic range is calculated as the ratio of the maximum luminance value to the minimum luminance value of a scene, as shown in equation 1 below.
\[ DR_{scene} = \frac{Y_{max}}{Y_{min}} \qquad (1) \]
where Y_{max} is the maximum scene luminance and Y_{min} is the minimum scene luminance.
Dynamic range is affected by many factors, such as the optics and the sensor performance [1]. Optical characteristics such as diffraction, aberrations, and stray light or flare can limit the scene luminance range. Additionally, dynamic range can be limited by the sensor's saturation and noise. In this experiment, the dynamic range was calculated by first measuring the scene luminance ratio, that is, the ratio of the brightest highlight to the darkest shadow. For this study, a high dynamic range scene was defined as 8 EVs and a low dynamic range scene was defined as any scene below 7 EVs.
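As an illustration of the definition above, the following minimal Python sketch computes the luminance ratio and expresses it in EVs (stops), where one EV corresponds to a doubling of luminance. The function names are hypothetical, and the classification treats "8 EVs" as "8 EVs or more", which is an assumption; scenes between 7 and 8 EVs are left unclassified because the text does not define them.

```python
import math


def dynamic_range(y_max: float, y_min: float) -> float:
    """Ratio of maximum to minimum scene luminance (both in the same unit)."""
    if y_min <= 0 or y_max <= 0:
        raise ValueError("luminances must be positive")
    if y_max < y_min:
        raise ValueError("y_max must not be smaller than y_min")
    return y_max / y_min


def ratio_to_ev(ratio: float) -> float:
    """Express a luminance ratio in EVs (stops): one EV is a factor of two."""
    return math.log2(ratio)


def classify_scene(ratio: float) -> str:
    """Classify a scene using the study's thresholds.

    Assumption: "8 EVs" is read as "8 EVs or more". The 7 to 8 EV band is
    not defined in the text, so it is reported as unclassified.
    """
    ev = ratio_to_ev(ratio)
    if ev >= 8:
        return "high"
    if ev < 7:
        return "low"
    return "unclassified"


if __name__ == "__main__":
    for ratio in (266, 15120):
        print(f"{ratio}:1 is about {ratio_to_ev(ratio):.2f} EV ({classify_scene(ratio)})")
```

Under these assumptions, the smallest ratio reported in this study (266:1) sits just above the 8 EV threshold.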
This study explored possible improvements to the testing process for HDR devices by conducting experiments in a controlled environment. These improvements would benefit future testing by simplifying the process, limiting variability, and improving the repeatability of testing under the same conditions. The study verified a way to reduce the time and cost of conducting preliminary testing of HDR capture devices.
Testing HDR capture devices is often cumbersome and time-consuming because it involves extensive field-testing under varying lighting conditions. The purpose of this study was to test a method of simulating natural exterior scene luminance ratios in a controlled laboratory setting. It sought to answer the question: how well does the "HDR Tester" marketed by SensorSpace, LLC predict the capabilities of mobile phones to produce HDR renderings? If the "HDR Tester" performed well, it would provide a consistent and repeatable method to test HDR devices with a laboratory scene, saving time and money that would normally be spent on field-testing.
The purpose was to test the feasibility and capabilities of the HDR Tester, and to find a method of testing the HDR capabilities of capture devices in a laboratory setting that reduces the cost and time it takes to test the devices outdoors.
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Experimental Approach
This study examined the challenges of testing the image quality of HDR capture (and associated rendering) in natural exterior scenes by simulating them in a laboratory environment.

Figure 1. Workflow for HDR image capture and evaluation
Step 1a - Capture natural exterior scenes and measure scene luminance ratio
Step 1b - Simulate natural exterior scene luminance ratios in the laboratory and capture laboratory scenes
Step 3 - Conduct psychometric paired comparisons test
Step 4 - Analyze results to determine if observer preferences for the natural exterior and laboratory scenes are equivalent

If HDR scenes simulated in a laboratory environment are to be a viable predictor for natural exterior HDR scenes, then observer ratings of image quality of a laboratory-generated scene (under some set of conditions) as captured by a variety of devices should agree with observer ratings of image quality of natural HDR scenes as captured by the same devices.
Images captured under lighting conditions produced by the HDR Tester were compared with images of natural exterior scenes to see whether the observer preferences for the two correlated.
The white patch reflects 32 times as much light as the black patch when both are evenly lit. By measuring the white patch in the shadows and dividing by 32, the value for the 'black' patch is calculated. The scene luminance ratio is then found using the calculated black patch value and the measured value from the white patch in the highlights.
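The patch calculation described above can be sketched in a few lines of Python. The function name and the luminance values in the example are hypothetical and are not measurements from this study; only the 32:1 white-to-black patch ratio comes from the text.

```python
# White patch reflects 32 times as much light as the black patch (from the text).
WHITE_TO_BLACK_RATIO = 32.0


def scene_luminance_ratio(white_in_highlights: float,
                          white_in_shadows: float,
                          patch_ratio: float = WHITE_TO_BLACK_RATIO) -> float:
    """Estimate the scene luminance ratio from two white-patch readings.

    The black patch in the shadows is not measured directly; it is derived
    by dividing the shadow white-patch reading by the patch ratio.
    Both readings must use the same unit (for example cd/m^2).
    """
    if white_in_highlights <= 0 or white_in_shadows <= 0:
        raise ValueError("luminance readings must be positive")
    black_in_shadows = white_in_shadows / patch_ratio
    return white_in_highlights / black_in_shadows


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    ratio = scene_luminance_ratio(white_in_highlights=8000.0, white_in_shadows=400.0)
    print(f"scene luminance ratio is {ratio:.0f}:1")
```

Deriving the black value from the white patch avoids reading a very dark patch directly, where meter noise and flare would have a larger relative effect.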
Very bright self-luminous objects in a scene (the sun, direct light sources in a night scene, headlights) generally carry no detail and are just bright blobs, so they are excluded from the measurement, as are specular highlights.

Methodology
Four natural exterior scenes were captured, during the day and at night, with four different models of mobile phones using their inherent HDR modes. The luminance ratios of these scenes were measured using various spectro-radiometers and luminance meters. Measurements were taken, with each of the devices, of a white patch on a target placed in the shadows, of a white patch on a target placed in the highlights, of the darkest adopted shadows, and of the brightest adopted highlights in a natural exterior scene. Measurement maps were created. These measurements were used to calculate the scene's luminance ratio.
Artificial scenes, which included characteristics of the natural exterior scenes, were created in a laboratory environment. The luminance ratios of highlights to shadows of the natural exterior scenes were replicated in the laboratory using an "HDR Tester" from SensorSpace, LLC. The "HDR Tester" is divided into two compartments with individual lighting controls. The top compartment is used to simulate the highlights of the natural exterior scene and the bottom compartment is used to simulate the shadows. These simulated scenes were captured and measured using the same methods as the natural exterior scenes.
A subjective image quality evaluation was conducted using observers to establish an observer preference scale separately for each scene. The study presented the observers with a side-by-side paired comparison, based on Thurstone's law of comparative judgement [2], in which they chose either image A or image B according to which image they thought rendered the most detail in both the highlights and the shadows. The observers were asked to compare the images from the different mobile phones for each individual scene. They were presented with both the natural exterior scenes and the laboratory scenes.
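To illustrate how a preference scale can be derived from paired-comparison votes, the sketch below applies Thurstone's Case V model, a common simplification of the law of comparative judgement. The text does not state which case of the law was used, so Case V is an assumption here, as are the function name, the clipping rule for unanimous votes, and the vote counts in the example.

```python
from statistics import NormalDist


def thurstone_case_v(wins):
    """Return one interval-scale value per device from a win-count matrix.

    wins[i][j] is the number of observers who preferred device i over
    device j. Each proportion is converted to a z-score with the inverse
    normal CDF, and a device's scale value is the mean of its z-scores
    against all other devices.
    """
    n = len(wins)
    norm = NormalDist()
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            if total == 0:
                raise ValueError(f"no comparisons between {i} and {j}")
            p = wins[i][j] / total
            # Assumption: clip unanimous results so the z-score stays finite.
            limit = 0.5 / total
            p = min(max(p, limit), 1.0 - limit)
            z_sum += norm.inv_cdf(p)
        scale.append(z_sum / (n - 1))
    return scale


if __name__ == "__main__":
    # Hypothetical votes from 20 observers for three devices.
    votes = [
        [0, 15, 18],
        [5, 0, 12],
        [2, 8, 0],
    ]
    for device, value in enumerate(thurstone_case_v(votes)):
        print(f"device {device}: {value:+.3f}")
```

The resulting values are on an interval scale with an arbitrary origin, so only differences between devices are meaningful.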
The correlation between the preference scales for each natural scene and the replicated laboratory scene was evaluated to determine whether the laboratory results matched those of the natural exterior scenes. For each natural exterior scene, the correlation coefficients between its preference scale and the preference scale obtained for each laboratory scene were calculated, and the laboratory scene with the highest correlation was identified. The closer the correlation is to 1, the closer the match and the better the HDR Tester is at simulating how an HDR capture device would perform in the field. It was determined that while it was difficult to accurately quantify the actual dynamic range of a natural exterior scene, especially at night, we could still simulate the luminance ratios of a wide range of natural exterior HDR scenes, from 266:1 to 15120:1, within a laboratory environment. Results of the subjective study indicated that reasonably good correlation (0.8 or higher on average) was obtained between the natural exterior and laboratory simulated scenes. However, such correlations were determined to be specific to the type of scene studied.
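The matching step described above can be sketched as follows. The text refers only to "correlation coefficients", so the use of the Pearson coefficient is an assumption, and the function names, scene labels, and scale values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def best_matching_lab_scene(natural_scale, lab_scales):
    """Correlate one natural scene's preference scale with each lab scene.

    natural_scale: one scale value per device for the natural scene.
    lab_scales: dict mapping a lab scene label to its per-device scale,
    with devices in the same order as natural_scale.
    Returns the label with the highest correlation and all correlations.
    """
    correlations = {
        label: pearson(natural_scale, scale) for label, scale in lab_scales.items()
    }
    best = max(correlations, key=correlations.get)
    return best, correlations


if __name__ == "__main__":
    # Hypothetical preference scales for four devices.
    natural = [0.9, 0.1, -0.3, -0.7]
    labs = {
        "lab_A": [0.8, 0.2, -0.4, -0.6],
        "lab_B": [-0.5, 0.6, 0.3, -0.4],
    }
    best, correlations = best_matching_lab_scene(natural, labs)
    print(best, {label: round(r, 3) for label, r in correlations.items()})
```

With only four devices per scene, each coefficient rests on four points, so small changes in one device's scale value can shift the correlation noticeably.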
In summary, natural exterior scenes were captured during the day and at night, and luminance measurements of the highlights and shadows were taken to calculate the scene luminance ratio. The natural exterior scene luminance ratios were then replicated using the HDR Tester in the laboratory, and the laboratory scenes were captured with the same capture devices.

Results (Data)
The HDR Tester from SensorSpace, LLC can replicate luminance ratios from 266:1 to 15120:1. The paired comparison test showed that correlations of 0.8 and higher are achievable between the natural exterior scenes and the simulated laboratory scenes for certain conditions. Results for certain circumstances (such as night scenes with a dominant light source) gave poor correlation. Preliminary testing shows that the HDR Tester can be used to simulate a wide variety of scenes and can make camera image quality testing quicker and more convenient for certain scenes. The team plans to conduct additional studies for a broader range of scenes with additional cameras and to narrow the range of questions posed to observers in the subjective study to better quantify the results.

Conclusions
Results show that the "HDR Tester" can be used to simulate a wide variety of scenes and can make preliminary testing of HDR capture devices quicker and more convenient for certain scenes.
The "HDR Tester" can reproduce scene luminance ratios from 266:1 to 15120:1, which corresponds to a range from roughly 8 stops to just under 14 stops.
We concluded that the white balance of an image plays a major part in observer preference testing. The best correlations achieved using color images in the paired comparison test were for the scenes with the most consistent color balance across all the devices. Scene 1 Daytime and its simulated laboratory counterpart, as shown in Figure 5, had the highest correlation because the white balance was very consistent for the natural exterior scene and the simulated laboratory scene. When the images were converted to monochrome, good correlation was achieved between all the natural exterior scenes and their laboratory simulations. This shows that the observers were not able to ignore color balance issues when picking a preference during the paired comparison test, even when they were directed to do so.
It was discovered that when a strong self-luminous object was present in the natural scene, a similar object needed to be present in the HDR Tester scene. However, the luminance ratio of the best-matching HDR Tester scene was not always the one closest to the luminance ratio of the natural scene.

Plans for Future Work
Plans for future work include conducting additional studies for a broader range of scenes with additional cameras and narrowing the range of questions posed to observers in the subjective study to better quantify the results. Additionally, further studies will be directed towards the effects of in-scene motion and color error.
Research should be conducted to explore the effects of simulating a wider range of illuminants and color temperatures in the "HDR Tester." This study encountered issues with the mobile devices' ability to achieve a proper neutral balance, which skewed the data when the study was conducted using color images.
Additionally, it would be ideal to be able to predict which settings (illumination ratio and self-luminous stimuli) in the "HDR Tester" will best predict the preferences for a natural HDR scene.
One problem HDR capture devices encounter is consistently freezing motion over multiple exposures. If motion is not captured properly, ghosting will occur in the image. This study did not explore motion as it applies to HDR capture; however, further research should be conducted in which motion is addressed.