The Synthetic and Measured Paired and Labeled Experiment (SAMPLE) Dataset for SAR ATR Development


Developing automatic target recognition (ATR) algorithms for synthetic aperture radar (SAR) imagery is an important step toward effectively processing the amount of data created by SAR platforms. Allowing computers to efficiently extract the data from these images and return only relevant information dramatically accelerates the decision-making process. However, to effectively use popular machine learning algorithms for this task, a large quantity of training data is needed. Collecting and labeling data is prohibitively expensive, so obtaining the required quantity of data requires computer simulation. This, in turn, introduces assumptions to the dataset that must be properly addressed. We have developed the Synthetic and Measured Paired and Labeled Experiment (SAMPLE) dataset to aid research in training networks with synthetic data for better generalization to measured imagery. The key feature of this data is that the computer-aided design (CAD) models used during simulation are carefully matched to electro-optical imagery that was taken during the SAR data collection. This removes much of the variation between simulated and measured data and leaves researchers free to investigate the underlying difference between the simulated and measured domains.


In today’s world, large quantities of data are used to solve a number of problems. This data is often plentiful and inexpensive; high-resolution sensors and fast data links provide a constant stream of information for a variety of purposes. This has upended the balance between human processing power and available data that existed a few decades ago, creating an ever-increasing “pixels-to-eyeballs ratio.” Because of this, it has become even more necessary to develop computer vision and processing techniques to intelligently distill this information for human consumption and decision making.

For imagery information, unifying decades of research in computer vision with extremely fast and parallelized computational resources has resulted in an effective toolset of machine-learning algorithms, such as convolutional neural networks [1] and recurrent neural networks [2]. These networks have driven fast advances in a host of fields; however, they require a significant amount of data. Fundamentally, these networks fit a high-dimensional, nonlinear parametric function to the data. Without a sufficiently large and varied dataset on which to train, the training process will cause the function to overfit the data, resulting in poor generalization. In general, collecting and truthing data for training machine-learning algorithms can be expensive.

For SAR, a sensor of interest in military and civilian applications, data collection for research is especially cost prohibitive. Collecting airborne SAR images involves flying a radar on an aircraft, which naturally costs much more than simply taking images of random objects with a camera. The cost of acquiring airborne SAR imagery is most likely a key reason that the current state-of-the-art SAR research dataset, the Moving and Stationary Target Acquisition and Recognition (MSTAR) [3] dataset, is over 20 years old. It can be reasonably assumed that new datasets for SAR data will not be forthcoming with great frequency.

In the absence of SAR data collected in the real world, a machine-learning solution to the SAR ATR problem requires using simulated SAR imagery, which is formed by computing how a radar pulse interacts with a computer model of a target. Because simulations approximate the real world, images of the same target formed with the same parameters in the two domains will be slightly different; we term this the “synthetic/measurement gap.” However, careful attention to simulation parameters and the fidelity of computer models can help reduce this gap and drive productive research into creating an ATR that can generalize to measured data.

The SAMPLE dataset [4] was designed to foster investigation in minimizing the gap between simulated and measured SAR imagery.  While early research with this dataset has not conclusively solved this problem, we anticipate that access to this dataset by the wider defense community will accelerate research efforts.  A portion of the dataset has been cleared for public release, and the entire dataset is available to employees of U.S. government agencies and their contractors.  Data products include portable network graphics format images of the image magnitude and Matlab files with complex imagery data.
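
The relationship between the two data products can be illustrated with a minimal sketch that turns complex pixel values into an 8-bit magnitude image. The linear scaling to the peak value and the helper name `magnitude_to_uint8` are assumptions made for illustration; the scaling actually used to produce the released images is not described here and may differ.

```python
def magnitude_to_uint8(complex_image):
    """Map a 2-D list of complex pixels to 8-bit magnitudes (0-255).

    Assumption: linear scaling so the brightest pixel maps to 255.
    """
    magnitudes = [[abs(pixel) for pixel in row] for row in complex_image]
    peak = max(max(row) for row in magnitudes)
    if peak == 0:
        # An all-zero image stays all zero instead of dividing by zero.
        return [[0 for _ in row] for row in magnitudes]
    return [[round(255 * m / peak) for m in row] for row in magnitudes]


# Tiny made-up 2x2 "chip" of complex samples, not real SAR data.
chip = [[3 + 4j, 0j], [1j, -5 + 0j]]
pixels = magnitude_to_uint8(chip)
```

The phase of each complex sample is discarded in this conversion, which is why the complex files carry more information than the magnitude images.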

Due to space constraints, we present an overview of the dataset in this work and refer the reader to our published conference papers [4, 5] for an expanded view of the implementation details.  Here, we will discuss the philosophy and motivation of the dataset and discuss some of the research problems it was designed to address. The preparation of the dataset will be presented next, followed by a discussion on the fidelity of the dataset.  We then list a few research areas in which the dataset has been applied and present plans to expand the dataset and conclusions.


As the cost of computation has decreased, it has become more feasible to use asymptotic methods in electromagnetic computational software to simulate the interactions of a radar pulse with a computer model. The SAMPLE dataset leverages this inexpensive computation to add  a synthetic imagery extension to a portion of the MSTAR dataset. To create the synthetic imagery, we based the simulated SAR data on high-fidelity computer models of vehicular targets from the MSTAR dataset. These models were initially created during the MSTAR program; we added value by correcting errors, fixing surface normals, and leveraging modern standards and file formats. While models of more targets were available, several models were rejected due to lack of complexity or major missing parts. The remaining usable vehicle models are listed in Table 1.

Our primary goal in creating the dataset was to minimize the difference between the two realms of data. This enables investigating the gap in fidelity between the two domains that affects the real-world performance of ATR algorithms trained using simulated data. This gap is manifested in various ways, all of which are products of the assumptions made when creating synthetic data. For example, the ground plane in simulated imagery is assumed to be flat, with a statistically rough surface, and empty of objects. This does not match real-world conditions, where the ground consists of varying soil types and accompanying dielectric constants, exhibits elevation changes, and features rocks and plants. Simulation fidelity also suffers when using asymptotic electromagnetic simulation methods instead of rigorous but computationally impractical full-wave electromagnetic simulations. Furthermore, the simulated data is created using computer models of targets. These computer models, which may not perfectly match actual target geometry, are idealized by design. This, again, does not reflect properties of real targets, such as manufacturing variations, dents, or the presence of dirt. In order to overcome these differences, an ATR algorithm must correctly identify relevant features of the target (such as shape or pixel intensity) while ignoring imperfections, which is a challenging task.

Despite the inherent differences between simulated and measured data, there are many aspects that can be controlled. In particular, we focused on removing the differences in target articulation when creating this dataset. We also carefully minimized image differences that are a function of data collection and image formation, such as the data collection parameters, pixel spacing, and image formation algorithm, by replicating the parameters used in the MSTAR collect when forming the synthetic images.

Because the appearance of objects in SAR images is highly correlated to the relative positions of all surfaces (e.g., vehicle doors and hatches), we made great efforts to articulate these models to match their position during the MSTAR collect. We used data about one instance (the serial number shown in Table 1) of each vehicle as the ground truth. Sources for this positional information included photographic documentation, such as the images shown in Figure 1, and textual information from the MSTAR program reports. An iterative process was used to closely align the model positioning with this truth information—a time-consuming task. Due to the small wavelength (~3 cm) of radar frequencies, it was necessary to check the position of surfaces at these sizes, such as equipment and small hatches, in order to create an electromagnetic return consistent with the measured data.
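
The ~3-cm figure follows from the relation between wavelength and frequency, wavelength = c / f. The sketch below works this out for a carrier frequency of 10 GHz, which is an assumed, representative value chosen for illustration rather than a collection parameter stated in this article.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_cm(frequency_hz):
    """Return the free-space wavelength in centimeters for a frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 * SPEED_OF_LIGHT_M_PER_S / frequency_hz


# Assumption: a 10-GHz carrier, used only to illustrate the scale involved.
ASSUMED_CENTER_FREQUENCY_HZ = 10.0e9
wavelength = wavelength_cm(ASSUMED_CENTER_FREQUENCY_HZ)
```

Under this assumption, the wavelength comes out just under 3 cm, which is why surfaces of about that size, such as small hatches, can change the simulated return.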


The SAMPLE dataset exhibits good qualitative fidelity relative to the measured data. A visual inspection of randomly selected measured images (shown in the top row of Figure 2) and corresponding synthetic images (bottom row) shows that the position, orientation, and amplitude of the vehicles in these images agree. While there are obvious discrepancies in the background, we presume that a successful approach to solving the synthetic/measurement gap problem will compensate for them. In any case, the nontarget area of a SAR image does not necessarily have any particular property or pattern. We believe that ignoring background information will help solve this problem.

To assess the dataset’s fidelity from a neural network point of view, we applied the t-distributed Stochastic Neighbor Embedding (t-SNE) [6] visualization technique to the dataset.  In creating this representation, we trained a DenseNet [7] neural network on the measured images, then removed the last layer. Feature vectors for all images in the dataset were computed by evaluating each image using the trained network. The feature vectors were then presented to the t-SNE algorithm, which embeds high-dimensional points in a low-dimensional (two in this case) space. This transformation creates a probability space in which points proximal in high-dimensional space have a high probability of being close together in the representation space. The t-SNE algorithm does not have any notion of class type during its execution. Because of this, points from the same class are only represented near each other if their feature vectors are also close in Euclidean distance. Finally, we reassigned labels and data types to each point to produce the plots shown in Figure 3.
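
The neighbor probabilities described above can be illustrated with a minimal sketch of the first step of t-SNE [6]: converting Euclidean distances between feature vectors into conditional probabilities with a Gaussian kernel. As a simplifying assumption, the sketch uses one fixed bandwidth `sigma` for every point, whereas the full algorithm tunes a bandwidth per point from a perplexity setting and then optimizes the low-dimensional layout. The feature vectors are made-up stand-ins for network features, and `neighbor_probabilities` is a hypothetical helper name.

```python
import math


def neighbor_probabilities(features, i, sigma=1.0):
    """Return p(j|i) for every j != i using a fixed-bandwidth Gaussian kernel.

    Simplification: real t-SNE chooses a separate sigma for each point.
    """
    weights = {}
    for j, other in enumerate(features):
        if j == i:
            continue
        squared_distance = sum((a - b) ** 2 for a, b in zip(features[i], other))
        weights[j] = math.exp(-squared_distance / (2.0 * sigma ** 2))
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Made-up feature vectors: points 0 and 1 are close, point 2 is far away.
features = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [3.0, 3.0, 3.0]]
p = neighbor_probabilities(features, 0)
```

Class labels never enter the computation; only distances between feature vectors do, so same-class points end up near each other only when the network places them close together.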

Because the feature vectors are based on a network trained on measured data, it is understandable that the representation of the measured data in Figure 3a is more clustered by class than the synthetic data in Figure 3b. This clustering is a good proxy for how well a classifier will perform. While the clustering for the synthetic data is less clearly defined, the joint graph (Figure 3c) shows that most instances of each vehicle, in both domains, cluster in the same region of the two-dimensional space, with some exceptions. However, it appears that the measured and synthetic portions for each class, while adjacent, are somewhat disjoint. Nevertheless, this is a promising result, suggesting that it is possible to transfer information between the domains in a way that both sets of data can be separated by a network.

This separability is not so easily teased out by a neural network, however, which leads us to the current problem.  Neural networks, such as DenseNet [7], easily classify MSTAR imagery when trained on data at one elevation and tested on a similar elevation.  The average 10-class accuracy, shown in Figure 4a, hovers at a near-perfect level.  However, a network trained completely on our synthetic data and tested on measured data suffers a dramatic performance hit, as in Figure 4b.  Research is ongoing to bridge this gap.


The SAMPLE dataset has been used for basic research in a number of publications since its inception. These papers showcase some of our efforts to solve the problem of using synthetic data to train a generalizable machine-learning algorithm for ATR. Some of these approaches include using generative adversarial networks [8] to make the synthetic data look more realistic [9, 10], using image preprocessing techniques to reduce the variation between the image domains [11], using transfer learning approaches to blend the two datasets [12], and using Siamese networks to learn information about both domains [13]. While none of these approaches have completely solved this issue, they collectively indicate possible successful approaches to this problem.

In a broader context, we defined a set of challenge problems that we hope the dataset will address (see Lewis et al. [4]). These challenge problems include (a) training an algorithm entirely with synthetic data to completely generalize to measured data, (b) training with a very limited amount of data from each of the 10 classes, and (c) training with measured data from a subset of the classes.  While challenge problem (a) is the most difficult and most rewarding problem of the set, problems (b) and (c) prove interesting as well and encourage the use of existing measured data in conjunction with the ability to create large amounts of simulated data.  We have also set forth a basic machine-learning approach to these challenge problems [11].
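
One way to picture challenge problems (b) and (c) is as different recipes for assembling a training set. The sketch below is illustrative only and is not the protocol defined in [4]: the helper name `mix_training_set`, the class names, and the sample identifiers are all hypothetical. It adds a small number of measured samples per class to the synthetic samples, and a class with no measured data is left fully synthetic.

```python
import random


def mix_training_set(synthetic, measured, measured_per_class, seed=0):
    """Combine synthetic samples with a few measured samples per class.

    synthetic, measured: dicts mapping class label -> list of sample IDs.
    Classes missing from `measured` contribute synthetic samples only.
    """
    rng = random.Random(seed)  # Seeded so a split can be reproduced.
    training = {}
    for label, synthetic_samples in synthetic.items():
        available = measured.get(label, [])
        count = min(measured_per_class, len(available))
        chosen = rng.sample(available, count)
        training[label] = list(synthetic_samples) + chosen
    return training


# Hypothetical labels and IDs; class_b has no measured data, as in problem (c).
synthetic = {"class_a": ["s0", "s1", "s2"], "class_b": ["s3", "s4"]}
measured = {"class_a": ["m0", "m1"]}
training = mix_training_set(synthetic, measured, measured_per_class=1)
```

Setting `measured_per_class` to zero gives the fully synthetic training set of challenge problem (a).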

Beyond machine-learning applications, the synthetic portion of the dataset may also serve as a second standalone dataset to complement MSTAR. Many techniques for classification [14], feature extraction [15], image enhancement [16], and image segmentation [17] have been developed over the years and validated using the MSTAR dataset. In future research, such techniques may use the synthetic imagery as a validation set.


While SAR is an excellent all-weather sensor, additional information from other sensor modalities may also be useful. The SAMPLE dataset does not represent the final state of our dataset creation efforts, especially given the availability of open-source tools such as Blender [18] to create high-fidelity simulated camera imagery from the models we already have. This expansion to another sensor will foster research efforts in multisensor target classification and data fusion. We do not plan to limit this dataset solely to MSTAR imagery if other appropriate data sources can be found.

Unfortunately, real-world electrooptical (EO) imagery of the MSTAR targets is unavailable, except for the small number of truthing images used to determine the appropriate target articulations.

Extensions to the dataset for the MSTAR targets will be limited to synthetically-generated camera imagery, which still has utility.  For example, experiments may leverage synthetic EO data and SAR data to train an ATR algorithm, which can then be tested on a held-out set of EO data and measured SAR imagery.

We also hope to identify other sources of measured SAR data with a rich set of accompanying EO data.  Augmenting such a dataset with simulated EO and SAR data would be ideal to further study multitarget classification using synthetic data.  Because truthing the CAD models is so time intensive, it would also be interesting to reduce the truthing fidelity to study how much the target articulations must match in order to produce good results using techniques developed with the SAMPLE dataset.  Other interesting properties of this type of expansion include imaging resolution, image formation algorithm, new targets, and more challenging environments for the targets.

Aside from expanding the dataset, our work in using machine learning to bridge the gap between synthetic and measured data will continue, with new work building on many of the ideas mentioned in Section 4. Ideas in this direction include leveraging adversarial network attacks to increase network robustness, investigating the inherent interclass differences between target classes, mixing hand-designed descriptors and machine learning, and using neural networks to leverage more information (such as phase).


We have presented a brief overview of the SAMPLE dataset as a supplement to the implementation details presented in earlier papers [4, 5]. Currently, this dataset consists of measured SAR imagery from the MSTAR dataset and synthetic imagery designed to match these images in image formation parameters and target articulation.   By studying the remaining differences between the two sets of data, we anticipate that researchers will be able to discover ways to train an ATR system on synthetic data that can generalize to measured data.

  1. Krizhevsky, A., I. Sutskever, and G. E. Hinton. “ImageNet Classification With Deep Convolutional Neural Networks.” In Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
  2. Hochreiter, S., and J. Schmidhuber. “Long Short-Term Memory.” Neural Computation, 1997.
  3. Sandia National Laboratory. “MSTAR Overview.” Accessed 19 May 2017.
  4. Lewis, B., T. Scarnati, E. Sudkamp, J. Nehrbass, S. Rosencrantz, and E. Zelnio. “A SAR Dataset for ATR Development: The Synthetic and Measured Paired Labeled Experiment (SAMPLE).” In SPIE Algorithms for Synthetic Aperture Radar, Baltimore, MD, 2019.
  5. Lewis, B., J. Nehrbass, E. Sudkamp, S. Rosencrantz, and E. Zelnio. “A Deep Dive Into SAMPLE: The Synthetic and Measured Paired Labeled Experiment Dataset.” In MSS Tri-Service Radar Symposium, Orlando, FL, 2019.
  6. Van Der Maaten, L., and G. Hinton. “Visualizing High-Dimensional Data Using t-SNE.” Journal of Machine Learning Research, pp. 2579–2605, 9 November 2008.
  7. Huang, G., Z. Liu, L. Van Der Maaten, and K. Weinberger. “Densely Connected Convolutional Networks.” In Proceedings of 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, 2017.
  8. Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems, pp. 2672–2680, 2014.
  9. Lewis, B., J. Liu, and A. Wong. “Generative Adversarial Networks for SAR Image Realism.” In SPIE Algorithms for Synthetic Aperture Radar, p. 10, Orlando, FL, 2018.
  10. Lewis, B., O. DeGuchy, J. Sebastian, and J. Kaminski. “Realistic SAR Data Augmentation Using Machine-Learning Techniques.” In SPIE Algorithms for Synthetic Aperture Radar, Baltimore, MD, 2019.
  11. Scarnati, T., and B. Lewis. “A Deep Learning Approach to the Synthetic and Measured Paired Labeled Experiment (SAMPLE) Challenge Problem.” In SPIE Algorithms for Synthetic Aperture Radar, Baltimore, MD, 2019.
  12. Arnold, J., L. Moore, and E. Zelnio. “Blending Synthetic and Measured Data Using Transfer Learning for Synthetic Aperture Radar (SAR) Target Classification.” In SPIE Algorithms for Synthetic Aperture Radar, Orlando, FL, 2018.
  13. Friedlander, R., M. Levy, E. Sudkamp, and E. Zelnio. “Deep Learning Model-Based Algorithm for SAR ATR.” In SPIE Algorithms for Synthetic Aperture Radar, Orlando, FL, 2018.
  14. Chen, S., H. Wang, F. Xu, and Y.-Q. Jin. “Target Classification Using the Deep Convolutional Networks for SAR Images.” IEEE Transactions on Geoscience and Remote Sensing, 2016.
  15. Cui, J., J. Gudnason, and M. Brookes. “Automatic Recognition of MSTAR Targets using Radar Shadow and Superresolution Features.” In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing – Proceedings, 2005.
  16. Owirka, G., S. Verbout, and L. Novak. “Template-Based SAR ATR Performance Using Different Image Enhancement Techniques.” In SPIE 1999 Algorithms for Synthetic Aperture Radar, pp. 302–320, 1999.
  17. Huang, S., W. Huang, and T. Zhang. “A New SAR Image Segmentation Algorithm for the Detection of Target and Shadow Regions.” Scientific Reports, 2016.
  18. Blender Online Community. “Blender – a 3D Modelling and Rendering Package,” 2019.
