DocEng2013 Bilauca Healy - Splitting Wide Tables Optimally
Splitting Wide Tables Optimally
Mihai Bilauca, Patrick Healy
DocEng2013, September 10–13, 2013, Florence, Italy
Department of Computer Science and Information Systems University of Limerick, Ireland
Supported by Science Foundation Ireland under the research programme 01/P1.2/C009, Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.
Why this paper?
• Tables are widely used for presenting logical relationships between data items;
• Widely used WYSIWYG tools have poor support for wide tables;
• Authoring tables is hard, time-consuming and error-prone;
• Style manual recommendations are not always supported;
• Very little research exists in this area.
A wide table split across multiple pages
Grouping of data items increases readability
Style recommendations from Chicago Manual of Style
“For a two-page broadside table – which should be presented on facing pages if at all possible – column heads need not be repeated; for broadside tables that run beyond two pages, column heads are repeated only on each new verso.
Where column heads are repeated, the table number and “continued” should also appear.
For any table that is likely to run to more than one page, the editor should specify whether continued lines and repeated column heads will be needed and where footnotes should appear (usually at the end of the table as a whole).”
Overview
We present MIP solutions using OPL for 3 problems that occur when splitting wide tables, with the aim of minimizing the effect on the meaning of the data:
1. Minimize Page Count
2. Minimize Page Count and Column Positioning Changes
3. Minimize Page Count and Group Splitting
Report experimental results with IBM CPLEX 12.3
Conclusions
MIP – Mixed Integer Programming
OPL – Optimization Programming Language
1. Minimum Page Count
1. Minimum Page Count – OPL Model
dvar int+ pageSel[Pages] in 0..1;
dvar int+ X[Pages][Cols] in 0..1;

dexpr int pageCount = sum(p in Pages) pageSel[p];

minimize pageCount;

subject to {
  ct1: // select only one page for each column
    forall(j in Cols)
      sum(p in Pages) X[p][j] == 1;
  ct2: // only columns that fit in the page
    forall(p in Pages)
      sum(j in Cols) colW[j] / pageW * X[p][j] <= pageSel[p];
}
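For small tables the optimum of this model can be cross-checked without a solver. The following is a minimal Python sketch, not the authors' OPL model: it replaces the MIP with an exhaustive dynamic program over column subsets. The page width and column widths are those of the worked example in this deck (PageW 490 points); the function name `min_page_count` is an assumption made for illustration, and every column is assumed to fit on a page by itself.

```python
from functools import lru_cache

PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]


def min_page_count(col_w, page_w):
    """Exact minimum number of pages when columns may go on any page."""
    n = len(col_w)

    # width[mask] = total width of the columns selected by the bitmask
    width = [0] * (1 << n)
    for mask in range(1, 1 << n):
        low = mask & -mask
        width[mask] = width[mask ^ low] + col_w[low.bit_length() - 1]

    @lru_cache(maxsize=None)
    def best(mask):
        """Fewest pages needed for the columns still unplaced in `mask`."""
        if mask == 0:
            return 0
        low = mask & -mask      # the lowest unplaced column opens the next page
        rest = mask ^ low
        result = float("inf")
        sub = rest
        while True:             # try every set of companions for that column
            page = sub | low
            if width[page] <= page_w:
                result = min(result, 1 + best(mask ^ page))
            if sub == 0:
                break
            sub = (sub - 1) & rest
        return result

    return best((1 << n) - 1)


print(min_page_count(COL_W, PAGE_W))  # 5
```

The subset enumeration grows as 3^n, so this is only a check for tables of roughly 15 columns or fewer; the MIP formulation is what scales to the 60-column instances reported in the results.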
1. Minimum Page Count – Results
● Page count can be reduced by 14% to 25%
● The difficulty of the problem is not directly linked to the problem size but to the data itself

Columns       10      20      30      40      50      60
PC             7      16      19      29      34      48
OPC            6      12      15      23      26      39
%Imp      14.28%  25.00%  21.05%  20.68%  23.52%  18.75%
Time (s)    2.25    0.13    0.17    1.18    4.30    1.52
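The %Imp row follows directly from the PC and OPC rows as (PC - OPC) / PC. A short Python sketch of that arithmetic is given here; the function name `improvement_pct` is hypothetical, and truncating to two decimals (rather than rounding) is an assumption inferred from the slide values, for example 6/29 = 20.689...% shown as 20.68%.

```python
PC = [7, 16, 19, 29, 34, 48]    # page counts before optimization (slide table)
OPC = [6, 12, 15, 23, 26, 39]   # page counts after optimization (slide table)


def improvement_pct(pc, opc):
    """Relative page-count reduction in percent, truncated to two decimals."""
    return ((pc - opc) * 10000 // pc) / 100


IMPROVEMENT = [improvement_pct(p, o) for p, o in zip(PC, OPC)]
print(IMPROVEMENT)  # [14.28, 25.0, 21.05, 20.68, 23.52, 18.75]
```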
2. Minimum Page Count & Column Positioning Changes
PageW: 490 points
colW: [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages: {210,140} {210} {420} {280} {350,70} {140,140} {350}

Minimum 5 pages:
colIdx: [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
Pages: {210,280} {140,350} {420,70} {140,210} {350,140}

Minimum 5 pages and column position changes (posDiff):
colIdx: [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]
Pages: {210,140} {210,280} {420,70} {350,140} {140,350}
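The three layouts in this example can be reproduced with a few lines of Python. This is a sketch for checking the numbers, not part of the authors' tooling: `paginate` and `pos_diff` are hypothetical helper names, `colIdx[j]` is read as the new position of original column j (the reading under which the slide's page listings come out right), and posDiff is read as the number of column pairs whose relative order changed.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]


def paginate(col_idx, col_w=COL_W, page_w=PAGE_W):
    """Lay columns out by new position; open a new page when one does not fit."""
    order = sorted(range(len(col_w)), key=lambda j: col_idx[j])
    pages, fill = [[]], 0
    for j in order:
        if pages[-1] and fill + col_w[j] > page_w:
            pages.append([])
            fill = 0
        pages[-1].append(col_w[j])
        fill += col_w[j]
    return pages


def pos_diff(col_idx):
    """Count column pairs whose relative order differs from the original."""
    n = len(col_idx)
    return sum(
        1
        for j2 in range(n)
        for j1 in range(j2 + 1, n)
        if col_idx[j1] < col_idx[j2]
    )


ORIGINAL = list(range(1, 11))
MIN_PAGES = [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
MIN_PAGES_AND_MOVES = [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]

for col_idx in (ORIGINAL, MIN_PAGES, MIN_PAGES_AND_MOVES):
    print(len(paginate(col_idx)), pos_diff(col_idx), paginate(col_idx))
```

Under this reading, the first 5-page layout reorders 20 column pairs while the second reorders only 2, which is the difference the combined objective is meant to capture.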
dvar int+ pageSel[Pages] in 0..1;
// the slide declares both index arrays "in 0..1"; an index needs the full
// range, so Pages and Cols (assumed to be OPL ranges) are used here
dvar int+ pageIdx[Cols] in Pages;
dvar int+ colIdx[Cols] in Cols;

// check if j1 is placed on a page before j2
dexpr int posO[j1,j2 in Cols] = j1 <= j2-1;
dexpr int posN[j1,j2 in Cols] = (colIdx[j1] <= colIdx[j2]-1);
dexpr float posDiff = sum(j1,j2 in Cols : j2 < j1) abs(posO[j1,j2] - posN[j1,j2]);
dexpr int pageCount = sum(p in Pages) pageSel[p];

// a, b, obj1Val variables are used for OPL flow control
minimize a * pageCount + b * posDiff;
subject to {
  ct1: // do not exceed page width
    forall(p in Pages)
      sum(j in Cols) colW[j] * (p == pageIdx[j]) / pageW <= pageSel[p];
  ct2: // page and column indexes relationship
    forall(ordered j1,j2 in Cols)
      (pageIdx[j1] <= pageIdx[j2]-1) - (colIdx[j1] <= colIdx[j2]-1) == 0;
  ct3: // unique column index values
    forall(ordered j1,j2 in Cols)
      colIdx[j1] != colIdx[j2];
  // if the minimum page count obj1Val is set
  // maintain this value for subsequent searches
  ct4: if (obj1Val >= 0) pageCount == obj1Val;
}
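The two-stage use of the model (first find the minimum page count, then hold it fixed and minimize posDiff) can be imitated on the 10-column example with a small search. The sketch below is in Python and is not the authors' method: it swaps the MIP for a depth-first search that allows 0, 1, 2, ... changed column pairs until an order fits the target page count. The names `_search` and `min_pos_diff` are hypothetical, the target of 5 pages is taken from the example slide, and pages are assumed to be filled in reading order.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]


def _search(order, used, fill, pages, inv, budget, target):
    """Extend `order` by one column at a time within the inversion budget."""
    if len(order) == len(COL_W):
        return list(order)
    for j, w in enumerate(COL_W):
        if used[j]:
            continue
        # Every unplaced column with a smaller original index will now come
        # after j, so each one is a changed relative position.
        new_inv = inv + sum(1 for i in range(j) if not used[i])
        if new_inv > budget:
            break               # later columns would cost at least as much
        if fill + w <= PAGE_W:
            new_fill, new_pages = fill + w, pages
        else:
            new_fill, new_pages = w, pages + 1
        if new_pages > target:
            continue
        used[j] = True
        order.append(j)
        found = _search(order, used, new_fill, new_pages, new_inv, budget, target)
        order.pop()
        used[j] = False
        if found is not None:
            return found
    return None


def min_pos_diff(target_pages):
    """Smallest number of changed pairs that still fits in `target_pages`."""
    n = len(COL_W)
    for budget in range(n * (n - 1) // 2 + 1):
        found = _search([], [False] * n, 0, 1, 0, budget, target_pages)
        if found is not None:
            return budget, [j + 1 for j in found]   # 1-based column sequence
    return None


print(min_pos_diff(5))  # (2, [1, 2, 3, 5, 4, 7, 6, 8, 9, 10])
```

Raising the budget one step at a time keeps the search tiny when a near-original order exists, as it does here; for instances that need many position changes the search space grows quickly and a solver-based model is the better tool.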
Results
● Promising performance:
– 2.25s for minimizing a 10 column table with posDiff 33 down to 4, page count from 9 down to 8;
– 89s for minimizing a 20 column table with posDiff 194 down to 4, page count from 13 down to 11;
● Computational time increases with the number of columns
● A data instance may have no better solution
3. Minimum Page Count & Group Splitting
User specifies which columns should preferably be kept together
PageW: 490 points
colW: [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages: {210,140} {210} {420} {280} {350,70} {140,140} {350}

Minimum 5 pages:
colIdx: [3, 5, 4, 7, 10, 6, 8, 1, 2, 9]
Pages: {210,280} {420} {70,350} {350,140} {210,140,140}

Group columns 2, 3 and 7:
colIdx: [2, 3, 7, 4, 9, 10, 6, 8, 1, 5]
Pages: {140,210,70} {420} {140,350} {350,140} {210,280}
int colG[Cols] = ...; // column groups
dvar int+ pageSel[Pages] in 0..1;
// the slide declares pageIdx "in 0..1"; a page index needs the full
// range, so Pages (assumed to be an OPL range) is used here
dvar int+ pageIdx[Cols] in Pages;

// find the first column of the group
int gFirstCol[g in groups] =
  first({j | j in Cols : colG[j] == g});

// counts how many columns of a group are on a
// different page than the group's first column
dexpr int gSplit[g in groups] =
  sum(j in Cols : colG[j] == g) (pageIdx[j] != pageIdx[gFirstCol[g]]);
dexpr int gSplitCount = sum(g in groups) (gSplit[g] >= 1);
dexpr int pageCount = sum(p in Pages) pageSel[p];
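To make the gSplit and gSplitCount expressions concrete, the Python sketch below evaluates them on the two 5-page layouts from the grouping example. It is an illustration only: `page_index` and `group_split_count` are hypothetical names, the layouts are written as lists of column numbers per page (read off the colIdx and Pages lines of the example), and leaving ungrouped columns out of `COL_GROUP` is an assumption, since the slides do not show how colG encodes them.

```python
# Column numbers per page, read from the example slide.
UNGROUPED = [[3, 5], [4], [7, 10], [6, 8], [1, 2, 9]]
GROUPED = [[2, 3, 7], [4], [9, 10], [6, 8], [1, 5]]
COL_GROUP = {2: 1, 3: 1, 7: 1}   # columns 2, 3 and 7 form group 1


def page_index(pages):
    """Map each column number to the 1-based page it is printed on."""
    return {col: p for p, cols in enumerate(pages, start=1) for col in cols}


def group_split_count(page_idx, col_group):
    """Return (gSplit per group, number of groups that are split)."""
    first_col = {}
    for j in sorted(col_group):              # lowest column number is first
        first_col.setdefault(col_group[j], j)
    g_split = {g: 0 for g in first_col}
    for j, g in col_group.items():
        if page_idx[j] != page_idx[first_col[g]]:
            g_split[g] += 1
    return g_split, sum(1 for v in g_split.values() if v >= 1)


print(group_split_count(page_index(UNGROUPED), COL_GROUP))  # ({1: 2}, 1)
print(group_split_count(page_index(GROUPED), COL_GROUP))    # ({1: 0}, 0)
```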
// a, b, obj1Val variables are used for OPL flow control
// the slide repeats posDiff here; gSplitCount is the term this model
// defines and is assumed to be the intended one
minimize a * pageCount + b * gSplitCount;

subject to {
  ct1: // do not exceed page width
    forall(p in Pages)
      sum(j in Cols) colW[j] * (p == pageIdx[j]) / pageW <= pageSel[p];
  // if the minimum page count obj1Val is set
  // maintain this value for subsequent searches
  ct2: if (obj1Val >= 0) pageCount == obj1Val;
}
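For the 10-column example, a layout with the minimum page count and no split group can also be found by plain backtracking. The Python sketch below is a simplification and not the authors' model: it treats "group not split" as a hard constraint by merging the group into one wide item, whereas the OPL objective only penalizes splits. The name `pack_with_group` is hypothetical, and the widths, page width and group (columns 2, 3 and 7) come from the example slide.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
GROUP = [2, 3, 7]   # 1-based column numbers that must share a page


def pack_with_group(n_pages, group, col_w=COL_W, page_w=PAGE_W):
    """Assign columns to at most n_pages pages, keeping `group` on one page."""
    items = [[j] for j in range(1, len(col_w) + 1) if j not in group]
    if group:
        items.append(list(group))            # the group travels as one item

    def item_width(cols):
        return sum(col_w[j - 1] for j in cols)

    items.sort(key=item_width, reverse=True)  # widest first prunes sooner
    fill = [0] * n_pages
    pages = [[] for _ in range(n_pages)]

    def place(k):
        if k == len(items):
            return True
        w = item_width(items[k])
        tried_empty = False
        for p in range(n_pages):
            if fill[p] == 0:
                if tried_empty:
                    continue                  # empty pages are interchangeable
                tried_empty = True
            if fill[p] + w <= page_w:
                fill[p] += w
                pages[p].extend(items[k])
                if place(k + 1):
                    return True
                fill[p] -= w
                del pages[p][-len(items[k]):]
        return False

    if not place(0):
        return None
    return [sorted(p) for p in pages if p]


print(pack_with_group(5, GROUP))
```

Merging the group keeps the search short, but it cannot report the "relaxed" solutions mentioned in the results, where splitting a group is accepted in exchange for fewer pages; that trade-off needs the weighted objective of the model.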
Results
● Promising performance:
– 1 min for a 20 column table with 3 groups, none split, page count from 12 down to 9;
– 2 min for 30-40 column tables, but time increased up to 12 min when the number of groups increased;
● Computational time increases with the number of columns and groups
● Some relaxed solutions can be preferred
Conclusions
• An optimal arrangement of columns that minimizes the page count when splitting wide tables can be found in a relatively short running time; for tables with 60 columns a solution was found in less than 2s;
• If additional criteria are added, for example minimizing the number of relative column position changes, the problems become harder as the number of columns increases;
• The difficulty of the problems depends not only on the problem size but also on the complexity of the data.
Ongoing work
Minimizing the overall page count when a large table containing text is displayed on fixed size pages and neither column widths nor row heights are known in advance.
Thank you!
www.tabularlayout.org