What is PolyBase?
PolyBase is a new technology that integrates PDW (SQL Server Parallel
Data Warehouse) with the Hadoop Distributed File System (HDFS). It used to work only with PDW, so most small
to medium enterprises that don't have the appliance weren't able to benefit from it.
PolyBase allows users to query non-relational data stored in
Hadoop, blobs and files, whether on premises or in the cloud, and to run analytics
and BI on that data from within SQL Server. It also supports the concept of a Data
Lake: you query the data where it is stored and, once your query completes,
you leave it where it was. This makes it possible to analyze Big
Data in its current location and reduces the cost of moving the
data. The following diagram is taken from a Microsoft white paper and shows the
interaction that you can have with different data sources from within SQL
Server when using the PolyBase feature.
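
To give a rough idea of what querying Hadoop data "from within SQL Server" looks like, here is a minimal T-SQL sketch. It assumes PolyBase is already installed and Hadoop connectivity has been configured on the instance. Every object name, the HDFS address, the folder path and the column list are hypothetical placeholders I made up for illustration, not something from the white paper.

```sql
-- Hypothetical example: all names, addresses and paths below are placeholders.

-- 1. Tell SQL Server where the Hadoop cluster lives.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.example.com:8020'
);

-- 2. Describe how the files are laid out (comma-delimited text here).
CREATE EXTERNAL FILE FORMAT CsvFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- 3. Define an external table over a folder in HDFS.
--    The data is not copied; it stays where it is stored.
CREATE EXTERNAL TABLE dbo.WebClicks (
    UserId    INT,
    Url       NVARCHAR(400),
    ClickTime DATETIME2
)
WITH (
    LOCATION    = '/data/clicks/',
    DATA_SOURCE = MyHadoopCluster,
    FILE_FORMAT = CsvFileFormat
);

-- 4. Query it like any other table.
SELECT TOP 10 Url, COUNT(*) AS Clicks
FROM dbo.WebClicks
GROUP BY Url
ORDER BY Clicks DESC;
```

Once the external table exists, it can also be joined to regular SQL Server tables in the same query, which is where the Data Lake idea of leaving the data in place pays off.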
This feature is now available in the standalone Enterprise
edition of SQL Server 2016. That is really a game changer. I would love to play with this feature soon
and hopefully post about it on my blog here…