Shark Release 0.2
rxin edited this page Oct 16, 2012 · 30 revisions
Shark 0.2 is the first Shark release since the original 0.1 prototype. It adds new features and improves performance.
The major changes are documented below:
- Ram Sriharsha from Yahoo contributed a patch for the Shark Thrift server.
- Shark's Thrift server is compatible with Hive's Thrift server and can support multiple clients connecting to the same server to access the same list of cached tables.
- We have upgraded Shark to work with Hive 0.9, which introduces numerous features over the original Hive 0.7.
- Hive UDFs and UDAFs are fully supported now.
- Shark 0.2 also supports distributing resource files (e.g. jars) to the slaves using Hive's ADD FILE command.
- We have significantly simplified deployment. As documented on the Wiki page, you can download a binary distribution of Shark 0.2 and have it up and running locally in about 5 minutes.
- We have rewritten Shark's join and group-by code. For queries with a large number of distinct keys, join and group-by performance can improve by 2x.
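As a rough illustration of the resource and UDF support listed above, the statements below use standard Hive syntax. The file path, jar path, class name, function name, and table are all hypothetical, and this page does not confirm that each statement behaves identically in Shark 0.2, so treat it as a sketch rather than a tested session.

```sql
-- Ship a resource file to the slaves (hypothetical path).
ADD FILE /tmp/lookup.txt;

-- Standard Hive way to put a UDF jar on the classpath (hypothetical jar).
ADD JAR /tmp/my-udfs.jar;

-- Register a UDF from that jar under a session-local name
-- (hypothetical class and function name).
CREATE TEMPORARY FUNCTION my_lower AS 'com.example.hive.udf.MyLower';

-- Use the UDF in a query against a hypothetical table.
SELECT my_lower(name) FROM users LIMIT 10;
```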
Shark 0.2 requires Spark 0.6, because it takes advantage of the new features and performance improvements in that Spark release.