Pig Cogroup Operator¶
The only difference between the two operators is that the group operator is normally used with one relation, while the cogroup operator is used in statements involving two or more relations.
Grouping Two Relations using Cogroup¶
001,Robin,22,newyork 002,BOB,23,Kolkata 003,Maya,23,Tokyo 004,Sara,25,London 005,David,23,Bhuwaneshwar 006,Maggy,22,Chennai
$ hdfs dfs -put ~/pig/employee_details.txt hdfs://localhost:9000/pig_data/ grunt> student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray); grunt> employee_details = LOAD 'hdfs://localhost:9000/pig_data/employee_details.txt' USING PigStorage(',') as (id:int, name:chararray, age:int, city:chararray); grunt> cogroup_data = COGROUP student_details by age, employee_details by age; grunt> Dump cogroup_data;