Extract building footprints from satellite images using deep learning

OpenStreetMap currently holds 30,567,953 building footprints in the US (at last count), drawn both from editor contributions and from various city- or county-wide imports. In this paper, we propose a novel deep neural network that jointly detects building instances and regularizes noisy building boundary shapes from a single satellite image.
With the sample project that accompanies this blog post, we walk you through how to train such a model on an Azure Deep Learning Virtual Machine (DLVM). The sample code contains a walkthrough of carrying out the training and evaluation pipeline on a DLVM. For those eager to get started, you can head over to our repo on GitHub to read about the dataset, storage options and instructions on running the code or modifying it for your own dataset. The labels are released as polygon shapes defined using well-known text (WKT), a markup language for representing vector geometry objects on maps.
The count of true positive detections in orange is based on the area of the ground truth polygon to which the proposed polygon was matched. We can see that towards the left of the histogram, where small buildings are represented, the bars for true positive proposals in orange are much taller in the bottom plot.
Generate a classified raster using the Classify Pixels Using Deep Learning tool. Another piece of good news for those dealing with geospatial data is that Azure already offers a Geo Artificial Intelligence Data Science Virtual Machine (Geo-DSVM), equipped with ESRI’s ArcGIS Pro Geographic Information System.
In this post, we highlight a sample project of using Azure infrastructure for training a deep learning model to gain insight from geospatial data. An example of infusing geospatial data and AI into applications that we use every day is using satellite images to add street map annotations of buildings. In computer vision, the task of masking out pixels belonging to different classes of objects such as background or people is referred to as semantic segmentation. The semantic segmentation model we are training (a U-Net implemented in PyTorch, different from what the Bing team used) can be used for other tasks in analyzing satellite, aerial or drone imagery. You can use the same method to extract roads from satellite imagery, infer land use and monitor sustainable farming practices, as well as for applications in a wide range of domains such as locating lungs in CT scans for lung disease prediction and evaluating a street scene.
Since clouds can cover a much larger part of the imagery, it will be interesting to see how accurately the AI can reconstruct those areas. We also created a tutorial on how to use the Geo-DSVM for training deep learning models and integrating them with ArcGIS Pro to help you get started. Save the model.
If your organization is working on solutions to address environmental challenges using data and machine learning, we encourage you to apply for an AI for Earth grant so that you can be better supported in leveraging Azure resources and become a part of this purposeful community.
In addition, 76.9 percent of all pixels in the training data are background, 15.8 percent are interior of buildings and 7.3 percent are border pixels. The top histogram is for weights in ratio 1:1:1 in the loss function for background : building interior : building boundary; the bottom histogram is for weights in ratio 1:8:1.
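To make the effect of the class weights concrete, here is a minimal sketch of a class-weighted cross-entropy in plain Python. It is not the sample project's code: the class indices, the function name and the toy probabilities are assumptions made for illustration, and only the weight order (background : building interior : building boundary) comes from the post. In PyTorch, a comparable effect is usually obtained by passing a per-class `weight` tensor to `torch.nn.CrossEntropyLoss`, which also normalizes by the sum of the weights under its default mean reduction.

```python
import math

# Class indices are an assumption for this sketch; the weight order follows
# the post: background : building interior : building boundary.
BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of per-pixel negative log-likelihoods.

    probs:  one [p_background, p_interior, p_boundary] list per pixel
    labels: the true class index of each pixel
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pixel_probs, label in zip(probs, labels):
        w = class_weights[label]
        weighted_sum += -w * math.log(pixel_probs[label])
        weight_total += w
    return weighted_sum / weight_total


# Three toy pixels (hypothetical model outputs).
probs = [
    [0.7, 0.2, 0.1],  # background pixel, predicted fairly confidently
    [0.6, 0.3, 0.1],  # interior pixel that the model mistakes for background
    [0.3, 0.3, 0.4],  # boundary pixel
]
labels = [BACKGROUND, INTERIOR, BOUNDARY]

loss_equal = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
loss_interior = weighted_cross_entropy(probs, labels, [1.0, 8.0, 1.0])
print(loss_equal, loss_interior)
```

With the 1:8:1 weights, the misclassified interior pixel dominates the loss, so the optimizer is pushed harder to get building interiors right. That is one plausible reading of why small buildings are detected more often with this setting.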
The Building Footprint Extraction model is used to extract building footprints from high-resolution satellite imagery. This is a critical task in damage claim processing, and using deep learning can speed up the process and make it more efficient. Accurate building footprints extracted from high-resolution satellite imagery are becoming available from companies such as Ecopia, which has just announced a partnership with DigitalGlobe… They use AI to create or recreate areas in the source satellite images that are hidden behind clouds.
Zoom to an area of interest. Load a UnetClassifier model.
We chose a learning rate of 0.0005 for the Adam optimizer (default settings for other parameters) and a batch size of 10 chips, which worked reasonably well. We found that giving more weight to the interior of buildings helps the model detect significantly more small buildings (see the figure below for the result). The following segmentation results are produced by the model at various epochs during training for the input image and label pair shown above.
We use labeled data made available by the SpaceNet initiative to demonstrate how you can extract information from visual environmental data using deep learning. The polygon labels are transformed to 2D labels of the same dimension as the input images, where each pixel is labeled as one of background, boundary of building or interior of building.
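The transformation from polygon labels to per-pixel labels can be sketched as follows. This is a simplified illustration in plain Python, not the SpaceNet utility code: the label values, the function names and the one-pixel-wide boundary ring are all assumptions, and the parser only handles the outer ring of a simple WKT `POLYGON`. How the sample project actually draws the boundary class is not described here.

```python
import re

BACKGROUND, BOUNDARY, INTERIOR = 0, 1, 2  # label values are an assumption


def parse_wkt_polygon(wkt):
    """Return the outer ring of a simple 'POLYGON ((x y, ...))' string."""
    ring = re.search(r"\(\(([^()]*)\)", wkt).group(1)
    # Keep only x and y; any third coordinate (z) is ignored.
    return [tuple(float(v) for v in pt.split()[:2]) for pt in ring.split(",")]


def point_in_polygon(x, y, ring):
    """Even-odd ray casting; expects a closed ring (first point == last)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_labels(wkt, height, width):
    """Rasterize one polygon into a 3-class label grid (list of rows)."""
    ring = parse_wkt_polygon(wkt)
    # Test each pixel centre (col + 0.5, row + 0.5) against the polygon.
    filled = [[point_in_polygon(c + 0.5, r + 0.5, ring) for c in range(width)]
              for r in range(height)]
    labels = [[BACKGROUND] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not filled[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # A filled pixel next to an unfilled pixel (or the image edge)
            # is labeled as boundary; all other filled pixels are interior.
            on_edge = any(
                not (0 <= nr < height and 0 <= nc < width and filled[nr][nc])
                for nr, nc in neighbours)
            labels[r][c] = BOUNDARY if on_edge else INTERIOR
    return labels


label_grid = polygon_to_labels("POLYGON ((1 1, 6 1, 6 6, 1 6, 1 1))", 8, 8)
for row in label_grid:
    print("".join(str(v) for v in row))
```

For real imagery the polygon coordinates would first have to be converted from geographic coordinates to pixel coordinates, a step this sketch leaves out.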
In June 2018, our colleagues at Bing announced the release of 124 million building footprints in the United States in support of the OpenStreetMap project, an open data initiative that powers many location-based services and applications. The geospatial data and machine learning communities have joined efforts on this front, publishing several datasets such as Functional Map of the World (fMoW) and the xView Dataset for people to create computer vision solutions on overhead imagery.
About 17.37 percent of the training images contain no buildings. Since this is a reasonably small percentage of the data, we did not exclude or resample images.
There are a number of parameters for the training process, the model architecture and the polygonization step that you can tune. The weight for the three classes (background, boundary of building, interior of building) in computing the total loss during training is another parameter to experiment with. After epoch 7, the network has learnt that building pixels are enclosed by border pixels, separating them from road pixels.
Make sure you have downloaded the model and added the imagery layer in ArcGIS Pro.
The techniques here can be applied in many different situations and we hope this concrete example serves as a guide to tackling your specific problem. I would like to thank Victor Liang, Software Engineer at Microsoft, who worked on the original version of this project with me as part of the coursework for Stanford’s CS231n in Spring 2018, and Wee Hyong Tok, Principal Data Scientist Manager at Microsoft, for his help in drafting this blog post.
As part of the AI for Earth team, I work with our partners and other researchers inside Microsoft to develop new ways to use machine learning and other AI approaches to solve global environmental challenges. Today, subject matter experts working on geospatial data go through such collections manually with the assistance of traditional software, performing tasks such as locating, counting and outlining objects of interest to obtain measurements and trends. The Bing team was able to create so many building footprints from satellite images by training and applying a deep neural network model that classifies each pixel as building or non-building. We show how to carry out the procedure on an Azure Deep Learning Virtual Machine (DLVM), which is GPU-enabled and has all major frameworks pre-installed so you can start model …
In the sample code we make use of the Vegas subset, consisting of 3854 images of size 650 x 650 pixels.
Each plot in the figure is a histogram of building polygons in the validation set by area, from 300 square pixels to 6000.
We used the Classify Pixels Using Deep Learning tool to segment the imagery using the model and post-processed the resulting raster in ArcGIS Pro to extract building footprints… Three deep learning models are now available in ArcGIS Online.
A final step is to produce the polygons by assigning all pixels predicted to be building boundary as background to isolate blobs of building pixels. Blobs of connected building pixels are then described in polygon format, subject to a minimum polygon area threshold, a parameter you can tune to reduce false positive proposals.
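The blob isolation and area filtering described above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the project's implementation: the label values, the function name `extract_blobs` and the choice of 4-connectivity are hypothetical, and the final step of tracing each blob into a polygon outline is omitted.

```python
from collections import deque

BACKGROUND, BOUNDARY, INTERIOR = 0, 1, 2  # label values are an assumption


def extract_blobs(pred, min_area):
    """Group connected interior pixels and drop blobs smaller than min_area.

    pred is a list of rows holding the predicted class of each pixel.
    Returns a list of blobs, each a list of (row, col) pixel coordinates.
    """
    height, width = len(pred), len(pred[0])
    # Predicted boundary pixels count as background, so that buildings
    # separated only by a boundary line end up in different blobs.
    building = [[v == INTERIOR for v in row] for row in pred]
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for r in range(height):
        for c in range(width):
            if not building[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                blob.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and building[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(blob) >= min_area:
                blobs.append(blob)
    return blobs


# Two buildings split by a column of boundary pixels, plus one noisy pixel.
pred = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 2, 2, 1, 2, 2, 0],
    [0, 2, 2, 1, 2, 2, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 2, 0],
]
all_blobs = extract_blobs(pred, min_area=1)
kept_blobs = extract_blobs(pred, min_area=2)
print(len(all_blobs), len(kept_blobs))
```

Raising `min_area` removes the isolated pixel while keeping both buildings, which is the trade-off the threshold is meant to control: fewer false positive proposals at the risk of losing genuinely small buildings.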
The topic of this blog is a ready-to-use deep learning model to extract building footprints. These models are available as deep learning packages (DLPKs) that can be used with ArcGIS Pro, Image Server and ArcGIS API for Python. Deploy the model and extract footprints. Preview the results.
The optimum minimum polygon area threshold is about 200 square pixels.
Original images are cropped into nine smaller chips with some overlap using utility functions provided by SpaceNet (details in our repo). Some chips are partially or completely empty like the examples below, which is an artifact of the original satellite images, and the model should be robust enough not to propose building footprints on empty regions.
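One simple way to cut a square image into nine overlapping chips is to space the chip origins evenly along each side, as in the sketch below. This is not the SpaceNet utility code. The 650-pixel image side comes from the Vegas subset described in the post, while the 256-pixel chip size and the function names are assumptions chosen for illustration.

```python
def chip_origins(image_size, chip_size, chips_per_side=3):
    """Evenly spaced top-left offsets so the chips span the whole side."""
    if chips_per_side == 1:
        return [0]
    step = (image_size - chip_size) / (chips_per_side - 1)
    return [round(i * step) for i in range(chips_per_side)]


def chip_windows(image_size=650, chip_size=256, chips_per_side=3):
    """Return (row, col, size) windows covering a square image."""
    origins = chip_origins(image_size, chip_size, chips_per_side)
    return [(r, c, chip_size) for r in origins for c in origins]


windows = chip_windows()
for row, col, size in windows:
    print(f"rows {row}:{row + size}, cols {col}:{col + size}")
```

With these assumed sizes, neighbouring chips share a strip of 256 - 197 = 59 pixels, and the last chip ends exactly at the image edge, so no pixels are left uncovered.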
https://azure.microsoft.com/blog/how-to-extract-building-footprints-from-satellite-images-using-deep-learning/
We observe that initially the network learns to identify edges of building blocks and buildings with red roofs (different from the color of roads), followed by buildings of all roof colors after epoch 5.
“We wanted to use machine learning to extract street data and building footprints from the satellite imagery while using the minimum amount of human input.”
Deep Learning to the Rescue
Deep learning, a powerful form of AI, involves teaching a computer to detect patterns in large amounts of data, and to recognize and extract just the information you want.
Depending on the type of data employed for building extraction, the existing methods can be divided into two main groups: those using aerial or high-resolution satellite imagery and those using three-dimensional (3D) information.
After epoch 10, smaller, noisy clusters of building pixels begin to disappear as the shape of buildings becomes more defined. Another parameter of the procedure is the minimum polygon area threshold, below which blobs of building pixels are discarded. Increasing this threshold causes the false positive count to decrease as noisy false segments are excluded.
The trained model can be deployed on ArcGIS Pro or ArcGIS Enterprise to extract building footprints. The model was trained on large quantities of U.S. imagery datasets (30-60 cm resolution).