<% meth="aggregate_column_data" %>

<div class="lvl1">
	<p> Use this on a table that has a reference column with repeated values (e.g., an ID) to aggregate the values of one or more other columns by that reference. Every example below uses the --aggregate flag. </p>

	<h3> Basic example with a 2-column table, one with the ID and another one with values </h3>

		<p> In this case, the values in the second column are aggregated by the values in the first column. The -i flag gives the input table, the --agg_ref_col_index flag gives the column to use as reference (1-based) and the --agg_col flag gives the column to aggregate (1-based). The --agg_sep flag sets the separator used to join the values in the aggregated column. </p>
		<%
			table_name = 'cluster_genes_dis_desagg_3cols'
			args = f"-i tables/{table_name} --aggregate --agg_ref_col_index 1 --agg_sep ',' --agg_col 2"
		%>
		${show_n_exec(table_name, args, "cmdtabs")}
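
		<p> To make the grouping logic concrete, here is a minimal plain-Python sketch of concatenation by a reference column. It is not the tool's implementation: the function name aggregate_concat, the 0-based indexes and the sample rows are all made up for illustration, and the real output row order may differ. </p>

```python
from collections import defaultdict


def aggregate_concat(rows, ref_idx, agg_idx, sep=","):
    """Group rows by one reference column and join the values of another.

    Hypothetical helper for illustration only. Indexes are 0-based here,
    unlike the 1-based command-line flags.
    """
    groups = defaultdict(list)  # keys stay in first-seen order
    for row in rows:
        groups[row[ref_idx]].append(row[agg_idx])
    return [[key, sep.join(values)] for key, values in groups.items()]


# Made-up rows, not the contents of the documentation tables.
rows = [
    ["c1", "geneA"],
    ["c1", "geneB"],
    ["c2", "geneC"],
]
result = aggregate_concat(rows, ref_idx=0, agg_idx=1, sep=",")
print(result)  # [['c1', 'geneA,geneB'], ['c2', 'geneC']]
```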

	<h3> Example with a 3-column table, one being the ID and the other two corresponding to values. </h3>

		<p> In this case, values from the second and third columns will be aggregated based on the first one. </p>
		<%
			table_name = 'cluster_genes_dis_desagg_3cols'
			args = f"-i tables/{table_name} --aggregate --agg_ref_col_index 1 --agg_sep ',' --agg_col 2,3"
		%>
		${show_n_exec(table_name, args, "cmdtabs")}

	<h3> Having 2 columns as reference. </h3>

		<p> To group by two columns and keep both in the output without aggregating them, pass more than one column to the --agg_ref_col_index flag. Likewise, the --agg_col flag accepts more than one column to aggregate. </p>

		<%
			table_name = 'cluster_genes_dis_desagg_4cols'
			args = f"-i tables/{table_name} --aggregate --agg_ref_col_index 1,2 --agg_col 3,4"
		%>
		${show_n_exec(table_name, args, "cmdtabs")}
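
		<p> With several reference columns, each distinct combination of their values forms one group. The sketch below illustrates that idea in plain Python under stated assumptions: aggregate_multi is a hypothetical helper, indexes are 0-based, the sample rows are invented and the "," separator is only chosen for the example (the tool's default separator is not stated here). </p>

```python
from collections import defaultdict


def aggregate_multi(rows, ref_idxs, agg_idxs, sep=","):
    """Group rows by a tuple of reference columns and join each value column.

    Hypothetical helper for illustration only, with 0-based indexes.
    """
    # One list of collected values per aggregated column, per group.
    groups = defaultdict(lambda: [[] for _ in agg_idxs])
    for row in rows:
        key = tuple(row[i] for i in ref_idxs)
        for slot, i in zip(groups[key], agg_idxs):
            slot.append(row[i])
    return [
        list(key) + [sep.join(values) for values in slots]
        for key, slots in groups.items()
    ]


# Made-up rows: two reference columns followed by two value columns.
rows = [
    ["c1", "d1", "g1", "0.1"],
    ["c1", "d1", "g2", "0.2"],
    ["c1", "d2", "g3", "0.3"],
]
grouped = aggregate_multi(rows, ref_idxs=[0, 1], agg_idxs=[2, 3])
print(grouped)
```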

	<h3> Example with a 3-column table, one being the ID and the other two corresponding to values, and another aggregation function. </h3>
		<p> Besides concatenating the values in our column(s) of interest (the default mode), we can choose other aggregation methods with the --agg_mode flag: average ("mean"), median ("median"), maximum ("max"), minimum ("min"), sum ("sum"), standard deviation ("std"), variance ("var"), interquartile range ("IQR"), 25th percentile ("P25"), 75th percentile ("P75") and counts ("count"). </p>
		<p> Let's see an example with the average of the values from columns 2 and 3 based on column 1. </p>
		<%
			table_name = 'cluster_genes_dis_desagg_3cols'
			args = f"-i tables/{table_name} --aggregate --agg_ref_col_index 1 --agg_col 2,3 --agg_mode mean"
		%>
		${show_n_exec(table_name, args, "cmdtabs")}
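
		<p> Numeric modes replace the joined string with a single number per group. The following plain-Python sketch shows the idea for a subset of the modes; it is an assumption-laden illustration, not the tool's code. The names MODES and aggregate_numeric and the sample rows are hypothetical, and "std", "var" and the percentile modes are left out because their exact definitions (for example sample versus population) are not stated here. </p>

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical mapping from mode name to a Python function.
MODES = {
    "mean": mean,
    "median": median,
    "max": max,
    "min": min,
    "sum": sum,
    "count": len,
}


def aggregate_numeric(rows, ref_idx, agg_idxs, mode):
    """Group rows by one reference column and reduce each value column.

    Illustration only: indexes are 0-based and values are parsed as floats.
    """
    func = MODES[mode]
    groups = defaultdict(lambda: [[] for _ in agg_idxs])
    for row in rows:
        for slot, i in zip(groups[row[ref_idx]], agg_idxs):
            slot.append(float(row[i]))
    return [[key] + [func(values) for values in slots] for key, slots in groups.items()]


# Made-up rows: an ID followed by two numeric columns.
rows = [
    ["c1", "1", "10"],
    ["c1", "3", "20"],
    ["c2", "5", "30"],
]
means = aggregate_numeric(rows, ref_idx=0, agg_idxs=[1, 2], mode="mean")
print(means)  # [['c1', 2.0, 15.0], ['c2', 5.0, 30.0]]
```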

	<h3> Example of aggregation of counts. </h3>
	
		<p> This is useful to obtain a table with the ID in one column and, in the other, the number of times that ID appears in the disaggregated table. </p>
		<%
			table_name = 'cluster_genes_dis_desagg_3cols'
			args = f"-i tables/{table_name} --aggregate --agg_ref_col_index 1 --agg_col 2 --agg_mode count"
		%>
		${show_n_exec(table_name, args, "cmdtabs")}
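
		<p> Counting amounts to tallying how many rows share each ID. A minimal sketch with Python's collections.Counter, using invented rows and a hypothetical count_table variable, could look like this: </p>

```python
from collections import Counter

# Made-up rows: the first column is the ID whose occurrences we count.
rows = [
    ["c1", "geneA"],
    ["c1", "geneB"],
    ["c2", "geneC"],
]

# Counter keeps keys in first-seen order, like a regular dict.
counts = Counter(row[0] for row in rows)
count_table = [[key, n] for key, n in counts.items()]
print(count_table)  # [['c1', 2], ['c2', 1]]
```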

</div>