mysql - Collecting multiple columns of aggregate data with a join -
i'm trying figure out if query i'd @ doable or feasible in sql or if need collect raw data , process in application.
my schema looks this:
applications ================ id int application_steps ================= id int application_id int step_id int activated_at date completed_at date steps ===== id int step_type_id int
ideally, data in application_steps
:
| id | application_id | step_id | activated_at | completed_at | | 1 | 1 | 1 | 2013-01-01 | 2013-01-02 | | 2 | 1 | 2 | 2013-01-02 | 2013-01-02 | | 3 | 1 | 3 | 2013-01-02 | 2013-01-10 | | 4 | 1 | 4 | 2013-01-10 | 2013-01-11 | | 5 | 2 | 1 | 2013-02-02 | 2013-02-02 | | 6 | 2 | 2 | 2013-02-02 | 2013-02-07 | | 7 | 2 | 4 | 2013-02-09 | 2013-02-11 |
i want result:
| application_id | step_1_days | step_2_days | step_3_days | step_4_days | | 1 | 1 | 0 | 8 | 1 | | 2 | 0 | 5 | null | 2 |
note in reality there many more steps , many more applications looking at.
as can see, there has-many relation between applications
, application_steps
. possible given step not in use particular application. i'd amount of time each step takes (using datediff(completed_at, activated_at)
), in 1 row (the column names don't matter). @ possible?
secondary question: complicate things bit further, need secondary query joins application_steps
steps
, gets data steps particular step_type_id
. assuming part 1 possible, how can extend filter efficiently?
note: efficiency key here - yearly report, equates 2500 applications
70 different steps
, 44,000 application_steps
in production (not lot of data, potentially lot when joins factored in).
this should basic "pivoting" aggregation:
select id, max(case when step_id = 1 datediff(completed_at, activated_at) end) step_1_days, max(case when step_id = 2 datediff(completed_at, activated_at) end) step_2_days, max(case when step_id = 3 datediff(completed_at, activated_at) end) step_3_days, max(case when step_id = 4 datediff(completed_at, activated_at) end) step_4_days application_steps s group id;
you have repeat 70 steps.
to particular type of step:
select application_id, max(case when step_id = 1 datediff(completed_at, activated_at) end) step_1_days, max(case when step_id = 2 datediff(completed_at, activated_at) end) step_2_days, max(case when step_id = 3 datediff(completed_at, activated_at) end) step_3_days, max(case when step_id = 4 datediff(completed_at, activated_at) end) step_4_days application_steps s join steps on s.step_id = steps.id , steps.step_type_id = xxx group application_id;
Comments
Post a Comment