/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
Pig Change Log
Trunk (unreleased changes)
INCOMPATIBLE CHANGES
IMPROVEMENTS
PIG-4146: Create a target to run mr and tez unit test in one shot (daijy)
PIG-4144: Make pigunit.PigTest work in tez mode (daijy)
PIG-4128: New logical optimizer rule: ConstantCalculator (daijy)
PIG-4124: Command for Python streaming udf should be configurable (cheolsoo)
PIG-4114: Add Native operator to tez (daijy)
PIG-4117: Implement merge cogroup in Tez (daijy)
PIG-4119: Add message at end of each testcase with timestamp in Pig system tests (nmaheshwari via daijy)
PIG-4008: Pig code change to enable Tez Local mode (airbots via daijy)
PIG-4091: Predicate pushdown for ORC (rohini via daijy)
PIG-4077: Some fixes and e2e test for OrcStorage (rohini)
PIG-4054: Do not create job.jar when submitting job (daijy)
PIG-4047: Break up pig withouthadoop and fat jar (daijy)
PIG-4062: Add ascending order option to builtin TOP function (raj171 via cheolsoo)
PIG-3558: ORC support for Pig (daijy)
PIG-2122: Parameter Substitution doesn't work in the Grunt shell (daijy)
PIG-4031: Provide Counter aggregation for Tez (daijy)
PIG-4028: add a flag to control the ivy resolve/retrieve output (gkesavan via daijy)
PIG-4015: Provide a way to disable auto-parallelism in tez (daijy)
PIG-3846: Implement automatic reducer parallelism (daijy)
PIG-3939: SPRINTF function to format strings using a printf-style template (mrflip via cheolsoo)
PIG-3970: Merge Tez branch into trunk (daijy)
OPTIMIZATIONS
BUG FIXES
PIG-4156: [PATCH] fix NPE when running scripts stored on hdfs:// (acoliver via daijy)
PIG-4159: TestGroupConstParallelTez and TestJobSubmissionTez should be excluded in Hadoop 20 unit tests (cheolsoo)
PIG-4154: ScriptState#setScript(File) does not close resources (lars_francke via daijy)
PIG-4155: Quitting grunt shell using CTRL-D character throws exception (abhishek.agarwal via daijy)
PIG-4157: Pig compilation failure due to HIVE-7208 (daijy)
PIG-4158: TestAssert is broken in trunk (cheolsoo)
PIG-4143: Port more mini cluster tests to Tez - part 7 (daijy)
PIG-4149: Rounding issue in FindQuantiles (daijy)
PIG-4145: Port local mode tests to Tez - part1 (daijy)
PIG-4076: Fix pom file (daijy)
PIG-4140: VertexManagerEvent.getUserPayload returns ReadOnlyBuffer after TEZ-1449 (daijy)
PIG-4136: No special handling jythonjar/jrubyjar in e2e tests after PIG-4047 (daijy)
PIG-4137: Fix hadoopversion 23 compilation due to TEZ-1469 (daijy)
PIG-4135: Fetch optimization should be disabled if plan contains no limit (cheolsoo)
PIG-4061: Make Streaming UDF work in Tez (hotfix PIG-4061-3.patch)
PIG-4134: TEZ-1449 broke the build (knoguchi)
PIG-4132: TEZ-1246 and TEZ-1390 broke a build (knoguchi)
PIG-4129: Pig -Dhadoopversion=23 compile fail after TEZ-1426 (daijy)
PIG-4127: Build failure due to TEZ-1132 and TEZ-1416 (lbendig)
PIG-4125: TEZ-1347 broke the build
PIG-4123: Increase memory for TezMiniCluster (daijy)
PIG-4122: Fix hadoopversion 23 compilation due to TEZ-1194 (daijy)
PIG-4061: Make Streaming UDF work in Tez (daijy)
PIG-4118: Fix hadoopversion 23 compilation due to TEZ-1237/TEZ-1407 (daijy)
PIG-4109: register local jar fail on Windows when Pig script is remote (daijy)
PIG-4116: Update Pig doc about Hadoop 2 Streaming Python UDF support (cheolsoo)
PIG-4112: NPE in packager when union + group-by followed by replicated join in Tez (rohini via cheolsoo)
PIG-4113: TEZ-1386 breaks hadoop 2 compilation in trunk (cheolsoo)
PIG-4110: TEZ-1382 breaks Hadoop 2 compilation (cheolsoo)
PIG-4105: Fix TestAvroStorage with ibm jdk (fang fang chen via daijy)
PIG-4108: Pig -Dhadoopversion=23 compile fail after TEZ-1317 (daijy)
PIG-4086: Fix Orc e2e tests for tez (daijy)
PIG-4101: Lower tez.am.task.max.failed.attempts to 2 from 4 in Tez mini cluster (cheolsoo)
PIG-4099: "ant copypom" failed with "could not find file $PIG_HOME/ivy/pig.pom to copy" (fang fang chen via cheolsoo)
PIG-4098: Vertex Location Hint api update after TEZ-1041 (jeagles via cheolsoo)
PIG-4088: TEZ-1346 breaks hadoop 2 compilation in trunk (cheolsoo)
PIG-4089: TestMultiQuery.testMultiQueryJiraPig1169 fails in trunk after PIG-4079 in Hadoop 1 (cheolsoo)
PIG-4085: TEZ-1303 broke hadoop 2 compilation in trunk (cheolsoo)
PIG-4082: TEZ-1278 broke hadoop 2 compilation in trunk (cheolsoo)
PIG-4079: Parallel clause is not honored in local mode (cheolsoo)
PIG-4078: Port more mini cluster tests to Tez - part 6 (rohini)
PIG-4071: Fix TestStore.testSetStoreSchema, TestParamSubPreproc.testGruntWithParamSub, TestJobSubmission.testReducerNumEstimation (daijy)
PIG-4074: mapreduce.client.submit.file.replication is not honored in cached files (cheolsoo)
PIG-4052: TestJobControlSleep, TestInvokerSpeed are unreliable (daijy)
PIG-4053: TestMRCompiler succeeded with sun jdk 1.6 while failed with sun jdk 1.7 (daijy)
PIG-3982: ant target test-tez should depend on jackson-pig-3039-test-download (daijy)
PIG-4064: Fix tez auto parallelism test failures (daijy)
PIG-4075: TEZ-1311 broke Hadoop2 compilation (cheolsoo)
PIG-4070: Change from TezJobConfig to TezRuntimeConfiguration (rohini)
PIG-4068: ObjectCache causes ClassCastException (cheolsoo)
PIG-4067: TestAllLoader in piggybank fails with new hive version (rohini)
PIG-4065: Fix failing unit tests in Tez (rohini)
PIG-4060: Refactor TezJob and TezLauncher (cheolsoo)
PIG-2689: JsonStorage fails to find schema when LimitAdjuster runs (rohini)
PIG-4056: Remove PhysicalOperator.setAlias (rohini)
PIG-4058: Use single config in Tez for input and output (rohini)
PIG-3886: UdfDistributedCache_1 fails in tez branch (cheolsoo)
PIG-4055: Build broke after TEZ-1130 API rename (knoguchi)
PIG-3935: Port more mini cluster tests to Tez - part 5 (rohini)
PIG-3984: PigServer.shutdown removes the tez resource folder (daijy via rohini)
PIG-4048: TEZ-692 has an incompatible API change removing TezSession (rohini)
PIG-4044: Pig should use avro-mapred-hadoop2.jar instead of avro-mapred.jar when compile with hadoop 2 (daijy)
PIG-4043: JobClient.getMap/ReduceTaskReports() causes OOM for jobs with a large number of tasks (cheolsoo)
PIG-4036: Fix e2e failures - JobManagement_3, CmdErrors_3 and BigData_4 (daijy)
PIG-4041: org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper compiling error (jeagles via cheolsoo)
PIG-4038: SPRINTF should return NULL on any NULL input (mrflip via daijy)
PIG-4025: TestLoadFuncWrapper, TestLoadFuncMetaDataWrapper, TestStoreFuncWrapper and TestStoreFuncMetadataWrapper fail on IBM JDK (ahireanup via daijy)
PIG-4024: TestPigStreamingUDF and TestPigStreaming fail on IBM JDK (ahireanup via daijy)
PIG-4023: BigDec/Int sort is broken (ahireanup via daijy)
PIG-4003: Error is thrown by JobStats.getOutputSize() when storing to a Hive table (cheolsoo)
PIG-4035: Fix CollectedGroup e2e tests for tez (daijy)
PIG-4034: Exclude TestTezAutoParallelism when -Dhadoopversion=20 (cheolsoo)
PIG-4033: Fix MergeSparseJoin e2e tests on tez (daijy)
PIG-3478: Make StreamingUDF work for Hadoop 2 (lbendig via daijy)
PIG-4032: BloomFilter fails with s3 path in Hadoop 2.4 (cheolsoo)
PIG-4018: Schema validation fails with UNION ONSCHEMA (daijy)
PIG-4022: Fix tez e2e test SkewedJoin_6 (daijy)
PIG-4001: POPartialAgg aggregates too aggressively when multiple values aggregated (tmwoodruff via cheolsoo)
PIG-4027: Always check for latest Tez snapshot dependencies (lbendig via cheolsoo)
PIG-4020: Fix tez e2e tests MapPartialAgg_[2-4], StreamingPerformance_[6-7] (daijy)
PIG-4019: Compilation broken after TEZ-1169 (daijy)
PIG-4014: Fix Rank e2e test failures on tez (daijy)
PIG-4013: Order by multiple column fail on Tez (daijy)
PIG-3983: TestGrunt.testKeepGoigFailed fail on tez mode (daijy)
PIG-3959: Skewed join followed by replicated join fails in Tez (cheolsoo)
PIG-3995: Tez unit tests shouldn't run when -Dhadoopversion=20 (cheolsoo)
PIG-3986: PigSplit to support multiple split class (tongjie via cheolsoo)
PIG-3988: PigStorage: CommandLineParser is not thread safe (tmwoodruff via cheolsoo)
PIG-2409: Pig show wrong tracking URL for hadoop 2 (lbendig via rohini)
PIG-3978: Container reuse does not work across PigServer (daijy)
PIG-3974: E2E test data generation fails in cluster mode (lbendig via cheolsoo)
PIG-3969: Javascript UDF fails if no output schema is defined (lbendig via cheolsoo)
PIG-3971: Pig on tez fails to run in Oozie in secure cluster (rohini)
PIG-3968: OperatorPlan.serialVersionUID is not defined (daijy)
Release 0.13.1 - Unreleased
INCOMPATIBLE CHANGES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
PIG-4139: pig query throws error java.lang.NoSuchFieldException: jobsInProgress on MRv1 (satish via cheolsoo)
PIG-4133: Need to update the default $HCAT_HOME dir in the PIG script (mnarayan via cheolsoo)
PIG-4106: Describe shouldn't trigger execution in batch mode (cheolsoo)
Release 0.13.0
INCOMPATIBLE CHANGES
PIG-3996: Delete zebra from svn (cheolsoo)
PIG-3898: Refactor PPNL for non-MR execution engine (cheolsoo)
PIG-3485: Remove CastUtils.bytesToMap(byte[] b) method from LoadCaster interface (cheolsoo)
PIG-3419: Pluggable Execution Engine (achalsoni81 via cheolsoo)
PIG-2207: Support custom counters for aggregating warnings from different udfs (aniket486)
IMPROVEMENTS
PIG-3892: Pig distribution for hadoop 2 (daijy)
PIG-4006: Make the interval of DAGStatus report configurable (cheolsoo)
PIG-3999: Document PIG-3388 (lbendig via cheolsoo)
PIG-3954: Document use of user level jar cache (aniket486)
PIG-3752: Fix e2e Parallel test for Windows (daijy)
PIG-3966: Document variable input arguments of UDFs (lbendig via aniket486)
PIG-3963: Documentation for BagToString UDF (mrflip via daijy)
PIG-3929: pig.temp.dir should allow to substitute vars as hadoop configuration does (aniket486)
PIG-3913: Pig should use job's jobClient wherever possible (fixes local mode counters) (aniket486)
PIG-3941: Piggybank's Over UDF returns an output schema with named fields (mrflip via cheolsoo)
PIG-3545: Separate validation rules from optimizer (daijy)
PIG-3745: Document auto local mode for pig (aniket486)
PIG-3932: Document ROUND_TO builtin UDF (mrflip via cheolsoo)
PIG-3926: ROUND_TO function: rounds double/float to fixed number of decimal places (mrflip via cheolsoo)
PIG-3901: Organize the Pig properties file and document all properties (mrflip via cheolsoo)
PIG-3867: Added hadoop home to build classpath for build pig with unit test on windows (Sergey Svinarchuk via gates)
PIG-3914: Change TaskContext to abstract class (cheolsoo)
PIG-3672: Pig should not check for hardcoded file system implementations (rohini)
PIG-3860: Refactor PigStatusReporter and PigLogger for non-MR execution engine (cheolsoo)
PIG-3865: Remodel the XMLLoader to be faster and more maintainable (aseldawy via daijy)
PIG-3737: Bundle dependent jars in distribution in %PIG_HOME%/lib folder (daijy)
PIG-3771: Piggybank Avrostorage makes a lot of namenode calls in the backend (rohini)
PIG-3851: Upgrade jline to 2.11 (daijy)
PIG-3884: Move multi store counters to PigStatsUtil from MRPigStatsUtil (rohini)
PIG-3591: Refactor POPackage to separate MR specific code from packaging (mwagner via cheolsoo)
PIG-3449: Move JobCreationException to org.apache.pig.backend.hadoop.executionengine (cheolsoo)
PIG-3765: Ability to disable Pig commands and operators (prkommireddi)
PIG-3731: Ability to specify local-mode specific configuration (useful for local/auto-local mode) (aniket486)
PIG-3793: Provide info on number of LogicalRelationalOperator(s) used in the script through LogicalPlanData (prkommireddi)
PIG-3778: Log list of running jobs along with progress (rohini)
PIG-3675: Documentation for AccumuloStorage (elserj via daijy)
PIG-3648: Make the sample size for RandomSampleLoader configurable (cheolsoo)
PIG-259: allow store to overwrite existing directory (nezihyigitbasi via daijy)
PIG-2672: Optimize the use of DistributedCache (aniket486)
PIG-3238: Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters and inserts another set of characters at a specified starting point (nezihyigitbasi via daijy)
PIG-3299: Provide support for LazyOutputFormat to avoid creating empty files (lbendig via daijy)
PIG-3642: Direct HDFS access for small jobs (fetch) (lbendig via cheolsoo)
PIG-3730: Performance issue in SelfSpillBag (rajesh.balamohan via rohini)
PIG-3654: Add class cache to PigContext (tmwoodruff via daijy)
PIG-3463: Pig should use hadoop local mode for small jobs (aniket486)
PIG-3573: Provide StoreFunc and LoadFunc for Accumulo (elserj via daijy)
PIG-3653: Add support for pre-deployed jars (tmwoodruff via daijy)
PIG-3645: Move FileLocalizer.setR() calls to unit tests (cheolsoo)
PIG-3637: PigCombiner creating log spam (rohini)
PIG-3632: Add option to configure cacheBlocks in HBaseStorage (rohini)
PIG-3619: Provide XPath function (Saad Patel via gates)
PIG-3590: remove PartitionFilterOptimizer from trunk (aniket486)
PIG-3580: MIN, MAX and AVG functions for BigDecimal and BigInteger (harichinnan via cheolsoo)
PIG-3569: SUM function for BigDecimal and BigInteger (harichinnan via rohini)
PIG-3505: Make AvroStorage sync interval take default from io.file.buffer.size (rohini)
PIG-3563: support adding archives to the distributed cache (jdonofrio via cheolsoo)
PIG-3388: No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage (lbendig via cheolsoo)
PIG-3522: Remove shock from pig (daijy)
PIG-3295: Casting from bytearray failing after Union even when each field is from a single Loader (knoguchi)
PIG-3444: CONCAT with 2+ input parameters fail (lbendig via daijy)
PIG-3117: A debug mode in which pig does not delete temporary files (ihadanny via cheolsoo)
PIG-3484: Make the size of pig.script property configurable (cheolsoo)
OPTIMIZATIONS
PIG-3882: Multiquery off mode execution is not done in batch and very inefficient (rohini)
BUG FIXES
PIG-4037: TestHBaseStorage, TestAccumuloPigCluster has failures with hadoopversion=23 (daijy)
PIG-4005: depend on hbase-hadoop2-compat rather than hbase-hadoop1-compat when hbaseversion is 95 (daijy)
PIG-4021: Fix TestHBaseStorage failure after auto local mode change (PIG-3463) (daijy)
PIG-4029: TestMRCompiler is broken after PIG-3874 (daijy)
PIG-4030: TestGrunt, TestPigRunner fail after PIG-3892 (daijy)
PIG-3975: Multiple Scalar reference calls leading to missing records (knoguchi via rohini)
PIG-4017: NPE thrown from JobControlCompiler.shipToHdfs (cheolsoo)
PIG-3997: Issue on Pig docs: Testing and Diagnostics (zjffdu via cheolsoo)
PIG-3998: Documentation fix: invalid page links, wrong Groovy udf example (lbendig via cheolsoo)
PIG-4000: Minor documentation fix for PIG-3642 (lbendig via cheolsoo)
PIG-3991: TestErrorHandling.tesNegative7 is broken in trunk/branch-0.13 (cheolsoo)
PIG-3990: ant docs is broken in trunk/branch-0.13 (cheolsoo)
PIG-3989: PIG_OPTS does not work with some version of HADOOP (daijy)
PIG-3739: The Warning_4 e2e test is broken in trunk (aniket486)
PIG-3976: Typo correction in JobStats breaks Oozie (rohini)
PIG-3874: FileLocalizer temp path can sometimes be non-unique (chitnis via cheolsoo)
PIG-3967: Grunt fails if we run more statements after the first store (daijy)
PIG-3915: MapReduce queries in Pigmix outputs different results than Pig's (keren3000 via daijy)
PIG-3955: Remove url.openStream() file descriptor leak from JCC (aniket486)
PIG-3958: TestMRJobStats is broken in 0.13 and trunk (aniket486)
PIG-3949: HiveColumnarStorage compile failure with Hive 0.14.0 (daijy)
PIG-3960: Compile fail against Hadoop 2.4.0 after PIG-3913 (daijy)
PIG-3956: UDF profile is often misleading (cheolsoo)
PIG-3950: Removing empty file PColFilterExtractor.java speeds up rebuilds (mrflip via cheolsoo)
PIG-3940: NullPointerException writing .pig_header for field with null name in JsonMetadata.java (mrflip via cheolsoo)
PIG-3944: PigNullableWritable toString method throws NPE on null value (mauzhang via cheolsoo)
PIG-3936: DBStorage fails on storing nulls for non varchar columns (jeremykarn via cheolsoo)
PIG-3945: Ant not sending hadoopversion to piggybank sub-ant (mrflip via cheolsoo)
PIG-3942: Util.buildPp() is incompatible with Non-MR execution engine (cheolsoo)
PIG-3902: PigServer creates cycle (thedatachef via cheolsoo)
PIG-3930: "java.io.IOException: Cannot initialize Cluster" in local mode with hadoopversion=23 dependencies (jira.shegalov via cheolsoo)
PIG-3921: Obsolete entries in piggybank javadoc build script (mrflip via cheolsoo)
PIG-3923: Gitignore file should ignore all generated artifacts (mrflip via cheolsoo)
PIG-3922: Increase Forrest heap size to avoid OutOfMemoryError building docs (mrflip via cheolsoo)
PIG-3916: isEmpty should not be early terminating (rohini)
PIG-3859: auto local mode should not modify reducer configuration (aniket486)
PIG-3909: Type Casting issue (daijy)
PIG-3905: 0.12.1 release can't be built for Hadoop2 (daijy)
PIG-3894: Datetime function AddDuration, SubtractDuration and all Between functions don't check for null values in the input tuple (jennythompson via cheolsoo)
PIG-3889: Direct fetch doesn't set job submission timestamps (cheolsoo)
PIG-3895: Pigmix run script has compilation error (rohini)
PIG-3885: AccumuloStorage incompatible with Accumulo 1.6.0 (elserj via daijy)
PIG-3888: Direct fetch doesn't differentiate between frontend and backend sides (lbendig via daijy)
PIG-3887: TestMRJobStats is broken in trunk (cheolsoo)
PIG-3868: Fix Iterator_1 e2e test on windows (ssvinarchukhorton via rohini)
PIG-3871: Replace org.python.google.* with com.google.* in imports (cheolsoo)
PIG-3858: PigLogger/PigStatusReporter is not set for fetch tasks (lbendig via cheolsoo)
PIG-3798: Registered jar in pig script are appended to the classpath multiple times (cheolsoo)
PIG-3844: Make ScriptState InheritableThreadLocal for threads that need it (amatsukawa via cheolsoo)
PIG-3837: ant pigperf target is broken in trunk (cheolsoo)
PIG-3836: Pig signature has guava version dependency (amatsukawa via cheolsoo)
PIG-3832: Fix piggybank test compilation failure after PIG-3449 (rohini)
PIG-3807: Pig creates wrong schema after dereferencing nested tuple fields with sorts (daijy)
PIG-3802: Fix TestBlackAndWhitelistValidator failures (prkommireddi)
PIG-3815: Hadoop bug causes Pig to fail silently with jar cache (aniket486)
PIG-3816: Incorrect Javadoc for launchPlan() method (kyungho via prkommireddi)
PIG-3673: Divide by zero error in runpigmix.pl script (suhassatish via daijy)
PIG-3805: ToString(datetime [, format string]) doesn't work without the second argument (jennythompson via daijy)
PIG-3809: AddForEach optimization doesn't set the alias of the added foreach (cheolsoo)
PIG-3811: PigServer.registerScript() wraps exception incorrectly on parsing errors (prkommireddi)
PIG-3806: PigServer constructor throws NPE after PIG-3765 (aniket486)
PIG-3801: Auto local mode does not call storeSchema (aniket486)
PIG-3754: InputSizeReducerEstimator.getTotalInputFileSize reports incorrect size (aniket486)
PIG-3679: e2e StreamingPythonUDFs_10 fails in trunk (cheolsoo)
PIG-3776: Conflicting versions of jline are present in trunk (cheolsoo)
PIG-3674: Fix TestAccumuloPigCluster on Hadoop 2 (elserj via daijy)
PIG-3740: Document direct fetch optimization (lbendig via cheolsoo)
PIG-3746: NPE is thrown if Pig fails before PigStats is initialized (cheolsoo)
PIG-3747: Update skewed join documentation (cheolsoo)
PIG-3755: auto local mode selection does not check lower bound for size (aniket486)
PIG-3447: Compiler warning message dropped for CastLineageSetter and others with no enum kind (knoguchi via cheolsoo)
PIG-3627: JsonStorage doesn't work in cases where other store functions (like PigStorage / AvroStorage) do work (ssvinarchukhorton via daijy)
PIG-3606: Pig script throws error when searching for hcatalog jars in latest hive (deepesh via daijy)
PIG-3623: HBaseStorage: setting loadKey and noWAL to false doesn't have any effect (nezihyigitbasi via rohini)
PIG-3744: SequenceFileLoader does not support BytesWritable (rohini)
PIG-3726: Ranking empty records leads to NullPointerException (jarcec via daijy)
PIG-3652: Pigmix parser (PigPerformanceLoader) deletes chars during parsing (keren3000 via daijy)
PIG-3722: Udf deserialization for registered classes fails in local_mode (aniket486)
PIG-3641: Split "otherwise" producing incorrect output when combined with ColumnPruning (knoguchi)
PIG-3682: mvn-inst target does not install pig-h2.jar into local .m2 (raluri via aniket486)
PIG-3511: Security: Pig temporary directories might have world readable permissions (rohini)
PIG-3664: Piggy Bank XPath UDF can't be called (nezihyigitbasi via daijy)
PIG-3662: Static loadcaster in BinStorage can cause exception (lbendig via rohini)
PIG-3617: problem with temp file deletion in MAPREDUCE operator (nezihyigitbasi via cheolsoo)
PIG-3649: POPartialAgg incorrectly calculates size reduction when multiple values aggregated (tmwoodruff via daijy)
PIG-3650: Fix for PIG-3100 breaks column pruning (tmwoodruff via daijy)
PIG-3643: Nested Foreach with UDF and bincond is broken (cheolsoo)
PIG-3616: TestBuiltIn.testURIwithCurlyBrace() silently fails (lbendig via cheolsoo)
PIG-3608: ClassCastException when looking up a value from AvroMapWrapper using a Utf8 key (rding)
PIG-3639: TestRegisteredJarVisibility is broken in trunk (cheolsoo)
PIG-3640: Retain intermediate files for debugging purpose in batch mode (cheolsoo)
PIG-3609: ClassCastException when calling compareTo method on AvroBagWrapper (rding via cheolsoo)
PIG-3584: AvroStorage does not correctly translate arrays of strings (jadler via cheolsoo)
PIG-3633: AvroStorage tests are failing when running against Avro 1.7.5 (jarcec via cheolsoo)
PIG-3612: Storing schema does not work cross cluster with PigStorage and JsonStorage (rohini)
PIG-3607: PigRecordReader should report progress for each inputsplit processed (rohini)
PIG-3566: Cannot set useMatches of REGEX_EXTRACT_ALL and REGEX_EXTRACT (nezihyigitbasi via cheolsoo)
PIG-2132: [Piggybank] MIN and MAX functions should ignore nulls (rekhajoshm via cheolsoo)
PIG-3581: Incorrect scope resolution with nested foreach (aniket486)
PIG-3285: Jobs using HBaseStorage fail to ship dependency jars (ndimiduk via cheolsoo)
PIG-3582: Document SUM, MIN, MAX, and AVG functions for BigInteger and BigDecimal (harichinnan via cheolsoo)
PIG-3525: PigStats.get() and ScriptState.get() shouldn't return MR-specific objects (cheolsoo)
PIG-3568: Define the semantics of POStatus.STATUS_NULL (mwagner via cheolsoo)
PIG-3561: Clean up PigStats and JobStats after PIG-3419 (cheolsoo)
PIG-3553: HadoopJobHistoryLoader fails to load job history on hadoop v 1.2 (lgiri via cheolsoo)
PIG-3559: Trunk is broken by PIG-3522 (cheolsoo)
PIG-3551: Minor typo on pig latin basics page (elserj via aniket486)
PIG-3526: Unions with Enums do not work with AvroStorage (jadler via cheolsoo)
PIG-3377: New AvroStorage throws NPE when storing untyped map/array/bag (jadler via cheolsoo)
PIG-3542: Javadoc of REGEX_EXTRACT_ALL (nyigitba via daijy)
PIG-3518: Need to ship jruby.jar in the release (daijy)
PIG-3524: Clean up Launcher and MapReduceLauncher after PIG-3419 (cheolsoo)
PIG-3515: Shell commands are limited from OS buffer (andronat via cheolsoo)
PIG-3520: Provide backward compatibility for PigRunner and PPNL after PIG-3419 (daijy via cheolsoo)
PIG-3519: Remove dependency on uber avro-tools jar (jarcec via cheolsoo)
PIG-3451: EvalFunc<T> ctor reflection to determine value of type param T is brittle (hazen via aniket486)
PIG-3509: Exception swallowing in TOP (vrajaram via aniket486)
PIG-3506: FLOOR documentation references CEIL function instead of FLOOR (seshness via daijy)
PIG-3497: JobControlCompiler should only do reducer estimation when the job has a reduce phase (amatsukawa via aniket486)
PIG-3469: Skewed join can cause unrecoverable NullPointerException when one of its inputs is missing (Jarek Jarcec Cecho via xuefuz)
PIG-3496: Propagate HBase 0.95 jars to the backend (Jarek Jarcec Cecho via xuefuz)
Release 0.12.1 (unreleased changes)
IMPROVEMENTS
PIG-3529: Upgrade HBase dependency from 0.95-SNAPSHOT to 0.96 (jarcec via daijy)
PIG-3552: UriUtil used by reducer estimator should support viewfs (amatsukawa via aniket486)
PIG-3549: Print hadoop jobids for failed, killed job (aniket486)
PIG-3047: Check the size of a relation before adding it to distributed cache in Replicated join (aniket486)
PIG-3480: TFile-based tmpfile compression crashes in some cases (dvryaboy via aniket486)
BUG FIXES
PIG-3661: Piggybank AvroStorage fails if used in more than one load or store statement (rohini)
PIG-3819: e2e tests containing "perl -e "print $_;" fails on Hadoop 2 (daijy)
PIG-3813: Rank column is assigned different uids every time when schema is reset (cheolsoo)
PIG-3833: Relation loaded by AvroStorage with schema is projected incorrectly in foreach statement (jeongjinku via cheolsoo)
PIG-3794: pig -useHCatalog fails using pig command line interface on HDInsight (ehans via daijy)
PIG-3827: Custom partitioner is not picked up with secondary sort optimization (daijy)
PIG-3826: Outer join with PushDownForEachFlatten generates wrong result (daijy)
PIG-3820: TestAvroStorage fail on some OS (daijy)
PIG-3818: PIG-2499 is accidentally reverted (daijy)
PIG-3516: pig does not bring in joda-time as dependency in its pig-template.xml (daijy)
PIG-3753: LOGenerate generates null schema (daijy)
PIG-3782: PushDownForEachFlatten + ColumnMapKeyPrune with user defined schema failing due to incorrect UID assignment (knoguchi via daijy)
PIG-3779: Assert constructs ConstantExpression with null when no comment is given (thedatachef via cheolsoo)
PIG-3777: Pig 12.0 Documentation (karinahauser via daijy)
PIG-3774: Piggybank Over UDF get wrong result (daijy)
PIG-3657: New partition filter extractor fails with NPE (cheolsoo)
PIG-3347: Store invocation brings side effect (daijy)
PIG-3670: Fix assert in Pig script (daijy)
PIG-3741: Utils.setTmpFileCompressionOnConf can cause side effect for SequenceFileInterStorage (aniket486)
PIG-3677: ConfigurationUtil.getLocalFSProperties can return an inconsistent property set (rohini)
PIG-3621: Python Avro library can't read Avros made with builtin AvroStorage (rusell.jurney via cheolsoo)
PIG-3592: Should not try to create success file for non-fs schemes like hbase (rohini)
PIG-3572: Fix all unit tests when building Pig with Hadoop 2.X on Windows (ssvinarchukhorton via daijy)
PIG-2629: Wrong Usage of Scalar which is null causes high namenode operation (rohini)
PIG-3593: Import jython standard module fail on cluster (daijy)
PIG-3576: NPE due to PIG-3549 when job never gets submitted (lbendig via cheolsoo)
PIG-3567: LogicalPlanPrinter throws OOM for large scripts (aniket486)
PIG-3579: pig.script's deserialized version does not maintain line numbers (jgzhang via aniket486)
PIG-3570: Rollback PIG-3060 (daijy)
PIG-3530: Some e2e tests are broken due to PIG-3480 (daijy)
PIG-3492: ColumnPrune dropping used column due to LogicalRelationalOperator.fixDuplicateUids changes not propagating (knoguchi via daijy)
PIG-3325: Adding a tuple to a bag is slow (dvryaboy via aniket486)
PIG-3512: Reducer estimator is broken by PIG-3497
PIG-3510: New filter extractor fails with more than one filter statement (aniket486 via cheolsoo)
Release 0.12.0
INCOMPATIBLE CHANGES
PIG-3082: outputSchema of a UDF allows two usages when describing a Tuple schema (jcoveney)
PIG-3191: [piggybank] MultiStorage output filenames are not sortable (Danny Antonelli via jcoveney)
PIG-3174: Remove rpm and deb artifacts from build.xml (gates)
IMPROVEMENTS
PIG-3503: More documentation for Pig 0.12 new features (daijy)
PIG-3445: Make Parquet format available out of the box in Pig (lbendig via aniket486)
PIG-3483: Document ASSERT keyword (aniket486 via daijy)
PIG-3470: Print configuration variables in grunt (lbendig via daijy)
PIG-3493: Add max/min for datetime (tyro89 via daijy)
PIG-3479: Fix BigInt, BigDec, Date serialization. Improve perf of PigNullableWritable deserialization (dvryaboy)
PIG-3461: Rewrite PartitionFilterOptimizer to make it work for all the cases (aniket486)
PIG-2417: Streaming UDFs - allow users to easily write UDFs in scripting languages with no
JVM implementation. (jeremykarn via daijy)
PIG-3199: Provide a method to retrieve the name of loader/storer in PigServer (prkommireddi via daijy)
PIG-3367: Add assert keyword (operator) in pig (aniket486)
PIG-3235: Avoid extra byte array copies in streaming (rohini)
PIG-3065: pig output format/committer should support recovery for hadoop 0.23 (daijy)
PIG-3390: Make Pig work with HBase 0.95 (jarcec via daijy)
PIG-3431: Return more information for parsing related exceptions. (jeremykarn via daijy)
PIG-3430: Add xml format for explaining MapReduce Plan. (jeremykarn via daijy)
PIG-3048: Add mapreduce workflow information to job configuration (billie.rinaldi via daijy)
PIG-3436: Make pigmix run with Hadoop2 (rohini)
PIG-3424: Package import list should consider class name as is first even if -Dudf.import.list is passed (rohini)
PIG-3204: Change script parsing to parse entire script instead of line by line (rohini)
PIG-3359: Register Statements and Param Substitution in Macros (jpacker via cheolsoo)
PIG-3182: Pig currently lacks functions to trim the whitespace on only one side (sarutak via cheolsoo)
PIG-3163: Pig current releases lack a UDF endsWith. This UDF tests if a given string ends with the specified suffix (sriramkrishnan via cheolsoo)
PIG-3015: Rewrite of AvroStorage (jadler via cheolsoo)
PIG-3361: Improve Hadoop version detection logic for Pig unit test (daijy)
PIG-3280: Document IN operator and CASE expression (cheolsoo)
PIG-3342: Allow conditions in case statement (cheolsoo)
PIG-3327: Pig hits OOM when fetching task reports (rohini)
PIG-3336: Change IN operator to use or-expressions instead of EvalFunc (cheolsoo)
PIG-3339: Move pattern compilation in ToDate as a static variable (rohini)
PIG-3332: Upgrade Avro dependency to 1.7.4 (nielsbasjes via cheolsoo)
PIG-3307: Refactor physical operators to remove methods parameters that are always null (julien)
PIG-3317: disable optimizations via pig properties (traviscrawford via billgraham)
PIG-3321: AVRO: Support user specified schema on load (harveyc via rohini)
PIG-2959: Add a pig.cmd for Pig to run under Windows (daijy)
PIG-3311: add pig-withouthadoop-h2 to mvn-jar (julien)
PIG-2873: Converting bin/pig shell script to python (vikram.dixit via daijy)
PIG-3308: Storing data in hive columnar rc format (maczech via daijy)
PIG-3303: add hadoop h2 artifact to publications in ivy.xml (julien)
PIG-3169: Remove intermediate data after a job finishes (mwagner via cheolsoo)
PIG-3173: Partition filter push down does not happen when partition key conditions include an AND and OR construct (rohini)
PIG-2786: enhance Pig launcher script wrt. HBase/HCat integration (ndimiduk via daijy)
PIG-3198: Let users use any function from PigType -> PigType as if it were builtin (jcoveney)
PIG-3268: Case statement support (cheolsoo)
PIG-3269: In operator support (cheolsoo)
PIG-200: Pig Performance Benchmarks (daijy)
PIG-3261: User set PIG_CLASSPATH entries must be prepended to the CLASSPATH, not
appended (qwertymaniac via daijy)
PIG-3141: Giving CSVExcelStorage an option to handle header rows (jpacker via cheolsoo)
PIG-3217: Add support for DateTime type in Groovy UDFs (herberts via daijy)
PIG-3218: Add support for biginteger/bigdecimal type in Groovy UDFs (herberts via daijy)
PIG-3248: Upgrade hadoop-2.0.0-alpha to hadoop-2.0.3-alpha (daijy)
PIG-3235: Add log4j.properties for unit tests (cheolsoo)
PIG-3236: parametrize snapshot and staging repo id (gkesavan via daijy)
PIG-3244: Make PIG_HOME configurable ([email protected] via daijy)
PIG-3233: Deploy a Piggybank Jar (njw45 via cheolsoo)
PIG-3245: Documentation about HBaseStorage (Daisuke Kobayashi via cheolsoo)
PIG-3211: Allow default Load/Store funcs to be configurable (prkommireddi via cheolsoo)
PIG-3136: Introduce a syntax making declared aliases optional (jcoveney via cheolsoo)
PIG-3142: [piggybank] Fixed-width load and store functions for the Piggybank (jpacker via cheolsoo)
PIG-3162: PigTest.assertOutput doesn't allow non-default delimiter (dreambird via cheolsoo)
PIG-3002: Pig client should handle CountersExceededException (jarcec via billgraham)
PIG-3189: Remove ivy/pig.pom and improve build mvn targets (billgraham)
PIG-3192: Better call to action to download Pig in docs (rjurney via jcoveney)
PIG-3167: Job stats are printed incorrectly for map-only jobs (Mark Wagner via jcoveney)
PIG-3131: Document PluckTuple UDF (rjurney via jcoveney)
PIG-3098: Add another test for the self join case (jcoveney)
PIG-3129: Document syntax to refer to previous relation (rjurney via jcoveney)
PIG-2553: Pig shouldn't allow attempts to write multiple relations into same directory (prkommireddi via cheolsoo)
PIG-3179: Task Information Header only prints out the first split for each task (knoguchi via rohini)
PIG-3108: HBaseStorage returns empty maps when mixing wildcard with other columns (christoph.bauer via billgraham)
PIG-3178: Print a stacktrace when ExecutableManager hits an OOM (knoguchi via rohini)
PIG-3160: GFCross uses unnecessary loop (sandyr via cheolsoo)
PIG-3138: Decouple PigServer.executeBatch() from compilation of batch (pkommireddi via cheolsoo)
PIG-2878: Pig current releases lack a UDF equalIgnoreCase. This function returns a Boolean value indicating whether string left is equal to string right. This
check is case insensitive. (shami via gates)
PIG-2994: Grunt shortcuts (prasanth_j via cheolsoo)
PIG-3140: Document PigProgressNotificationListener configs (billgraham)
PIG-3139: Document reducer estimation (billgraham)
PIG-2764: Add a biginteger and bigdecimal type to pig (jcoveney)
PIG-3073: POUserFunc creating log spam for large scripts (jcoveney)
PIG-3124: Push FLATTENs After FILTERs If Possible (nwhite via daijy)
PIG-3086: Allow A Prefix To Be Added To URIs In PigUnit Tests (nwhite via gates)
PIG-3091: Make schema, header and stats file configurable in JsonMetadata (pkommireddi via jcoveney)
PIG-3078: Make a UDF that, given a string, returns just the columns prefixed by that string (jcoveney)
PIG-3090: Introduce a syntax to be able to easily refer to the previously defined relation (jcoveney)
PIG-3057: Make PigStorage.readField() protected (pablomar and billgraham via billgraham)
PIG-2788: improved string interpolation of variables (jcoveney)
PIG-2362: Rework Ant build.xml to use macrodef instead of antcall (azaroth via cheolsoo)
PIG-2857: Add a -tagPath option to PigStorage (prkommireddi via cheolsoo)
PIG-2341: Need better documentation on Pig/HBase integration (jthakrar and billgraham via billgraham)
PIG-3075: Allow AvroStorage STORE Operations To Use Schema Specified By URI (nwhite via cheolsoo)
PIG-3062: Change HBaseStorage to permit overriding pushProjection (billgraham)
PIG-3016: Modernize more tests (jcoveney via cheolsoo)
PIG-2582: Store size in bytes (not mbytes) in ResourceStatistics (prkommireddi via billgraham)
PIG-3006: Modernize a chunk of the tests (jcoveney via cheolsoo)
PIG-2997: Provide a convenience constructor on PigServer that accepts Configuration (prkommireddi via rohini)
PIG-2933: HBaseStorage is using setScannerCaching which is deprecated (prkommireddi via rohini)
PIG-2881: Add SUBTRACT eval function (jocosti via cheolsoo)
PIG-3004: Improve exceptions messages when a RuntimeException is raised in Physical Operators (julien)
PIG-2990: the -secretDebugCmd shouldn't be a secret and should just be...a command (jcoveney)
PIG-2941: Ivy resolvers in pig don't have consistent chaining and don't have a kitchen sink option for novices (jgordon via azaroth)
PIG-2778: Add 'matches' operator to predicate pushdown (cheolsoo via jcoveney)
PIG-2966: Test failures on CentOS 6 because MALLOC_ARENA_MAX is not set (cheolsoo via sms)
PIG-2794: Pig test: add utils to simplify testing on Windows (jgordon via gates)
PIG-2910: Add function to read schema from output of Schema.toString() (initialcontext via thejas)
OPTIMIZATIONS
PIG-3395: Large filter expression makes Pig hang (cheolsoo)
PIG-3123: Simplify Logical Plans By Removing Unnecessary Identity Projections (njw45 via cheolsoo)
PIG-3013: BinInterSedes improve chararray sort performance (rohini)
BUG FIXES
PIG-3504: Fix e2e Describe_cmdline_12 (cheolsoo via daijy)
PIG-3128: Document the BigInteger and BigDecimal data type (daijy via cheolsoo)
PIG-3495: Streaming udf e2e tests failures on Windows (daijy)
PIG-3292: Logical plan invalid state: duplicate uid in schema during self-join to get cross product (cheolsoo via daijy)
PIG-3491: Fix e2e failure Jython_Diagnostics_4 (daijy)
PIG-3114: Duplicated macro name error when using pigunit (daijy)
PIG-3370: Add New Reserved Keywords To The Pig Docs (cheolsoo)
PIG-3487: Fix syntax errors in nightly.conf (arpitgupta via daijy)
PIG-3458: ScalarExpression lost with multiquery optimization (knoguchi)
PIG-3360: Some intermittent negative e2e tests fail on hadoop 2 (daijy)
PIG-3468: PIG-3123 breaks e2e test Jython_Diagnostics_2 (daijy)