Thirty Second International Conference on Information Systems, Shanghai 2011

Generating Creative Ideas Through Crowds:

An experimental study of combination

Completed Research Paper

Lixiu Yu Stevens Institute of Technology

Hoboken, NJ [email protected]

Jeffrey V. Nickerson Stevens Institute of Technology

Hoboken, NJ [email protected]

Abstract

The crowd is emerging as a new source of innovation, and here a new way of organizing the crowd to produce new ideas is discussed: an idea generation system using combination in which participants synthesize new designs from the efforts of their peers. A crowd generates designs; then another crowd combines the designs of the previous crowd. In an experiment with 540 participants, the combined designs are compared to the initial designs, and to a control condition in which fresh idea generation rather than combination is used. The results show that designs become more creative in later generations of the combination system, and the combination produces more creative ideas than the fresh idea generation. The model of crowdsourced idea generation discussed here may be used to instantiate systems that can be applied to a wide range of design problems. The work has pragmatic implications, and also theoretical implications: new forms of coordination are now possible, and, using the crowd, it is possible to build and test existing and emerging theories of coordination and design.

Keywords: Crowdsourcing, design, combination, creativity, idea generation, innovation

Online Communities and Digital Collaborations

Introduction

Because of the continuous evolution of web technologies and the emergence of crowdsourcing marketplaces, individuals can be brought together in nominal groups to perform collective tasks (Benkler 2006; Crowston and Howison, in press). Indeed, crowdsourcing can be defined as the assembly of a set of people to accomplish a task through an open call (Howe 2006). Since there are now many examples of crowdsourcing, the collective activities these crowds perform can be classified into two basic categories: those focused on creation, and those focused on decision (Malone 2009). The first application category, creation, is particularly attractive to companies, because innovation is important for firm survival and high rates of internal innovation are hard to sustain (Chesbrough and Vanhaverbeke 2006). But crowdsourcing demands organization, and these organizational forms may have many of the problems of more structured online communities (cf. Wasko and Faraj 2005).

Can we apply techniques from past studies of coordination and collaboration to organize crowd idea generation? A longstanding topic in information systems has been team-based idea generation: brainstorming, and its electronic equivalent (Osborn 1957; Dennis et al. 1996; Diehl and Stroebe 1987; Mullen et al. 1991). These past studies may inform us, but crowdsourcing presents different challenges. Unlike team-based idea generation processes, the crowd is distributed around the world. They may perform tasks collectively but they are a nominal group: they don't identify themselves as team members. Members of the crowd often lack expertise; they work for free or for a small amount of money on small tasks: online crowdsourcing marketplaces, such as Amazon Mechanical Turk and Crowdflower, have as members millions of such workers (Kittur et al. 2008; Ross et al. 2010; Biewald and Allick 2011). Therefore, crowd idea generation entails new collaboration mechanisms.

Our search for such mechanisms can be informed by other crowdsourcing work, most of which focused on decision or processing tasks, rather than idea generation tasks. A number of researchers have studied ways to attract and organize the crowd, including peer production and the use of online contests (Leimeister et al. 2009; Malone 2009; Piller 2010; Quinn and Bederson 2011). One of the streams of research explores the ways that crowds can be organized to perform collective tasks, mediated through technologies and outputs, minimizing extraneous social interaction. There are many examples: individuals play a shared game task (von Ahn and Dabbish 2004); an editing task is broken down into small pieces and assigned to individuals (Bernstein et al. 2010); individuals work on the same translation task in an iterative process (Bederson et al. 2010).

How does coordination occur in such situations? In many cases, the interactions are asynchronous, and so the tasks are coordinated through their digital representations. In the spirit of such task-mediated interaction mechanisms, we create a crowd idea generation process based on evolutionary computation. Evolutionary computation is based on a biological metaphor that includes parents, children, combination, random variation, and natural selection. Solutions are seen as populations that evolve over multiple generations. An optimal solution can evolve computationally through random variation and/or a combination of existing solutions. In the implementation of such processes using the crowd, there is a critical decision point: combination is chosen or not. If combination is chosen, the crowd will interact by combining each other’s ideas. This is consistent with many theories from management and psychology. Specifically, there is a common claim made in management literature that innovation is at its heart the recombination of existing knowledge (Fleming 2001; Henderson and Clark 1990; Nelson and Winter 1982; Olsson and Frey 2002). Researchers in the field of psychology also claim that creativity results from a combination process (Mumford et al. 1991; Simonton 2003; Thagard 1992). But the issue is not settled; there are counterarguments to these theories. Nevertheless, we will predict that crowd idea generation using combination will produce creative ideas, and we will conduct experiments to test this prediction.

This paper makes both theoretical and practical contributions. With respect to theory, the idea that combination improves the quality of ideas has been long proposed but seldom studied. As part of a broad research program, we have built a combination system to operationalize this difficult-to-pin-down idea, and experimentally test it (Nickerson and Sakamoto 2010; Nickerson, Yu and Sakamoto 2011; Yu and Nickerson 2011). The experiment reported here extends our research by providing a control condition: by comparing the proposed system to a non-combinatoric system in a controlled experiment, we provide clear evidence of the effectiveness of the system mechanism. This is the central differentiating contribution of this paper. In addition, we apply the system to a different design problem, demonstrating that the system can work across a range of problems. More generally, this work (both the system and its use as an apparatus in controlled experiments) provides a model upon which many theories related to idea generation, creativity, and innovation may be built, refined and tested. With respect to pragmatics, we describe a way of structuring crowds that allows for design collaboration with very little social interaction. This method can in principle be applied to a wide range of design domains, and has practical benefits because it allows for massive parallelism, increasing the diversity of the idea pool and reducing the time to produce new ideas.

In the following sections of the paper, we provide details of our method and report the results of an experiment. First, we will introduce in more detail the theories behind our study and develop our hypotheses.

Background

Innovation is broadly seen as a process that has many steps, from initial problem definition to eventual implementation (Eveleens 2010; Stevens and Burley 1997). In this work we focus on the early stages of innovation, those that involve idea generation and the assessment of these initial ideas. Many studies address these issues: brainstorming, electronic brainstorming and nominal group idea generation have been longstanding topics in the information systems literature (Osborn 1957; Diehl and Stroebe 1987; Nunamaker, Dennis, Valacich, Vogel and George 1991; Mullen, Johnson, and Salas 1991; Paulus, Larey, and Ortega 1995). In particular, the notion that we can improve ideas by combining them goes back at least to the team-based idea generation literature of the 1950s (Osborn 1957). When Osborn popularized brainstorming, one of his four rules was “combine and improve ideas”. In recent years, idea combination has re-emerged as a topic of focused research (Litchfield 2008). But it is not clear whether combination per se produces creative ideas.

The combination process, as implemented by the crowd, might be ineffective, or even destructive, for three reasons. First, designers or problem solvers tend to fixate: they get stuck on ideas that are not promising (Smith, Ward, and Schumacher 1993; Tversky and Chou 2010). Therefore, it is possible that designers will be preoccupied by the ideas to be combined and become less likely to come up with truly new ideas. Second, past experimental studies have found that people are more likely to remember and apply common ideas that they are exposed to (Putman and Paulus 2009; Rietzschel et al. 2006). By definition, creative ideas require both usefulness and novelty (Amabile 1996). But if the crowd attends more to the common features of the given ideas, the resulting combinations are less likely to be novel. Third, the crowd usually lacks expertise in the task domain, and may be performing a task just for money: its members may be neither intrinsically motivated nor capable, so they may degrade idea quality during the combination process.

Despite the above arguments, other literature from management, psychology and artificial intelligence provides us with evidence that the combination process is constructive, and leads to creative ideas. There is a well-established literature in management and psychology to support the idea that combination is generative. In the product development literature, knowledge is thought of as being dispersed among different individuals and entities (Hayek 1945). Innovation is described as the recombination of existing dispersed bits of knowledge (Henderson and Clark 1990; Nelson and Winter 1982). In particular, this view has been applied to the study of technological change and growth (Fleming 2001; Olsson and Frey 2002). Some researchers discuss how the combination of knowledge affects an individual’s innovation activity (Taylor and Greve 2006), but most management researchers focus on the macro-level, describing how knowledge recombination in general affects firms’ innovation (Buckley and Carter 2004; Tolstoy 2009).

In contrast, psychologists focus primarily on the individual mental process of creativity. Creativity is often said to be combinatorial: creative thought results from combination. Examples include the invention of a new technology, the discovery of a new theory and the creation of an art piece. Thagard argues that all creativity results from the combination of mental representations, visual or verbal (Thagard 1992). In a similar vein, Simonton claims that creativity involves the generation of a chance combination. To find fruitful combinations, it is necessary to construct the very numerous possible combinations, among which the useful ones are to be found (Simonton 2004). Stated differently, creativity involves making unfamiliar combinations of familiar ideas (Boden 1996). Constraints play a role in creativity: there is evidence that they can be conducive to creativity (Finke et al. 1996), and we might regard the instruction to combine ideas as enforcing a process constraint on the designer. Empirical studies show how idea combination can stimulate collective creativity (Hargadon and Bechky 2006). Recent experimental research suggests that idea combination can improve both the quality and quantity of idea generation (Kohn et al. 2011).

How, though, can we test as amorphous a concept as combination? Based on previous work in evolutionary algorithms (Deb 2001; Gero and Maher 1994; Goldberg 1989; Kosorukoff 2001), we can construct a replicable idea generation process. This process works like a genetic algorithm, except the work is done by humans, not computers. The algorithm will be explained in more detail in the methods section. In brief, humans create the generation 1 ideas. Then a choice is possible in the following generations. Members of the crowd in the following generations can combine the ideas that the previous crowds created. Or members of the crowd can continue to generate ideas anew in each generation, without reference to the previous ideas.

Given this apparatus, we can test the efficacy of combination in two ways. First, we seek to establish that combination will improve ideas if run through multiple generations:

Improvement Hypothesis: An idea generation system using combination will produce more creative ideas in later generations than the initial generation.

This hypothesis is readily falsifiable: if the ideas do not become more creative in later generations, it is rejected.

Second, we compare combination to a simpler condition, in which ideas are created without reference to previous crowd-generated ideas. The literature of genetic algorithms is replete with arguments related to efficacy of crossover (combination) versus mutation, in which ideas are not combined (Spears 1995). In many cases finding optimal solutions can be most efficiently accomplished through random search, either through genetic mutation or another optimization process called simulated annealing, based on a thermodynamic conception of randomness (Davis 1987). In sum, it might be best to generate as many diverse ideas as possible, and then find the best, rather than combine existing ideas. This last argument suggests that a fair comparison to combination might be independent idea generation, accomplished by letting a crowd generate new ideas rather than combine old ones: these new and fresh ideas may more thoroughly explore the space and increase the likelihood of a truly unusual design emerging. Because there are arguments in favor of combination and in favor of independent idea generation, the following hypothesis is important to test:

Comparison Hypothesis: An idea generation system using combination will produce more creative ideas than an idea generation system without combination.

Method

In describing the method, we will first describe the problem definition, and then describe the construction of the apparatus – the idea generation system. Next, creativity measurement will be discussed. Then the experiment design will be explained.

Problem Definition

We used a design problem that both novices and experts could accomplish. We considered presenting the crowd with the problem of the design of an information system, but did not think that a general crowd would possess the skills to design and assess one. (Note that the information systems artifact in this study is the combination system itself, not the output of the system.) We chose a design problem that has been the focus of previous creativity research: the design of a clock (Goldschmidt and Litan Sever 2009). Clocks are objects that are universally understood, so designers and evaluators will have common ground, but clocks also allow for many design variations. In the experiment, in order to introduce more requirements complexity, we required the design to be for an alarm clock. Participants were asked to communicate their design ideas using both sketches and text. Sketches provide a rich way to study design: they are often the means by which conceptual designs are developed and shared with others (Tversky and Chou 2010). Texts are used to explain the functionality of the design, and the explanatory combination of text and image produces an interpreted image that Gooding (2010) calls a visual representation.

The Experimental Apparatus

A crowdsourcing marketplace, design software and an organizational process were merged to form the idea generation system. Specifically, the technology was an integration of the Amazon Mechanical Turk platform (Amazon Mechanical Turk 2010) and the Google Docs drawing platform (Google Docs 2010). The organizational process was based on a genetic algorithm (Goldberg 1989; Holland 1975) implemented with human participants (Kosorukoff 2001). Thus, this paper’s method can be seen as an instance of design science (Hevner et al. 2004): an information systems artifact, the idea generation system, was built and then tested. The system is described in enough detail that it can be replicated, and the tests were designed to compare one idea generation technique against an alternative. First, we provide more detail on the crowdsourcing marketplace and the drawing platform. Then we discuss the algorithm.

There are two types of users in Amazon Mechanical Turk: requesters and workers. Requesters post their tasks by creating HITs (Human Intelligence Tasks) and workers take on these tasks in return for money. Previous studies have described the characteristics of Mechanical Turk workers (Kittur et al. 2008; Ross et al. 2010). In our study, the solicitation and management of participants was handled through this crowdsourcing market. All participants received compensation for either designing clocks or evaluating clocks.

The platform through which participants produce designs and combine designs is the Google Docs drawing tool. The drawing tool provides menu choices such as a freehand sketch option, a vector line, text, and a pull-down shape palette. When participants engaged in idea generation, they were directed to a Google document page already opened as a drawing. This same technique was used in later generations to present the designs to be combined. Participants had access to all the features of the drawing tool. In the combination tasks, the presented sketches were rasterized, so that participants were required to draw anew the features they saw in the images.

In order to clearly specify the idea generation system using combination, we first need to describe the way a genetic algorithm works; more detail can be found in Yu and Nickerson (2011). Such an algorithm starts with a first generation population, and performs a fitness ranking. Then, members of the population are selected to become parents of the next generation. Most often, tournament selection is used: two candidates are selected at random, and the fitter is chosen as a parent (Goldberg 1989). Another two candidates are selected, and again the fitter is chosen. The two chosen parents then produce offspring through a combination procedure. These offspring serve as a new population, which is then ranked, and the process repeats. Because combination can sometimes take the worst features of highly ranked parents, and thereby degrade the available genetic pool, a set of the strongest parents are moved into the next generation without change, an attribute of the algorithm referred to as “elitism” (Deb 2001).
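The selection-and-combination loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system used in the study: the function names (`tournament_pick`, `next_generation`), the `combine` callback and the toy numeric population are all introduced here for the example. In the study, combination is carried out by human participants rather than by a function.

```python
import random


def tournament_pick(population, fitness, rng):
    """Draw two candidates at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def next_generation(population, fitness, combine, n_offspring, n_elite, rng):
    """Build the next population: unchanged elites followed by new offspring."""
    # Elitism: the strongest members pass through unchanged.
    elites = sorted(population, key=fitness, reverse=True)[:n_elite]
    offspring = []
    for _ in range(n_offspring):
        # Each parent is the winner of its own two-candidate tournament.
        # Simplification: the same individual may win both tournaments.
        parent_a = tournament_pick(population, fitness, rng)
        parent_b = tournament_pick(population, fitness, rng)
        offspring.append(combine(parent_a, parent_b))
    return elites + offspring


# Toy stand-ins (hypothetical): individuals are numbers, fitness is the
# number itself, and "combination" is the average of the two parents.
rng = random.Random(0)
population = [rng.uniform(0, 10) for _ in range(60)]
fitness = lambda x: x
combine = lambda a, b: (a + b) / 2

new_population = next_generation(population, fitness, combine,
                                 n_offspring=45, n_elite=15, rng=rng)
```

With 45 offspring and 15 elites, the new population again has 60 members, which mirrors the generation sizes used in the experiment described later.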

The idea generation system using combination was based on the above description, and designed to run for three generations, as summarized in Figure 1.



Figure 1. The Generations of the Experiment

The Measurement of Creativity

Innovation is often described as the successful implementation of creative ideas (e.g. Amabile 1996; George 2007). As we mentioned in the previous section, we focus on the generation of creative ideas in the current study, as it is infeasible to manufacture and test the generated ideas. A creative idea should be both novel and potentially useful (Amabile 1996; 1998). Operationally, this leads to a binary measure of creativity: creative designs are designs that exceed a certain threshold on both the scales of originality and practicality (Finke 1990; Finke et al. 1996). Thus in our study, the designs are evaluated on these two scales; as part of our research program, we have also explored alternative ways of rating designs (Bao, Sakamoto and Nickerson 2011; Sakamoto and Bao 2011). The designs are then classified as creative or not based on the two scales. The measurement of creativity is further described in the following experiment design section.

The Experiment Design

The experiment is designed to show if combination is effective in producing creative designs. The effect is tested in two ways, as illustrated in Figure 2.

In the experimental condition, alternatively called the combination condition, the system was implemented and run for three generations. In the first generation, a crowd was asked to produce designs and, in the next two generations, different crowds were asked to produce new designs by combining the given designs from the previous generations. To test the improvement hypothesis, the combined designs from the last generation are compared to the initial designs from the first generation with respect to creativity. We predict there will be more creative designs in the last generation than in the first generation.

A parallel control condition was designed to provide a reference comparison for the combination system. The idea generation system was run, starting with the same first generation designs as the combination condition. In the following two generations, participants were asked to design clocks independently, without using any combination process. Because the control condition and the experimental condition share the same first generation, the creativity of the designs from the last two generations in the control condition is compared to the creativity of the designs from the last two generations in the combination condition (we could have compared all designs generated in each condition, but since both conditions share exactly the same set of seed designs, these designs will not make a difference in the end). This tests the comparison hypothesis, that combination is better than independent idea generation. We predict that more creative designs will be produced in the combination condition than in the control condition. These two conditions were run simultaneously online, and participants were randomly assigned to one or the other condition. Below, we elaborate on how the experiment was designed and implemented.

Figure 2. The Experiment Design

Generation 1 design for both conditions: Sixty randomly selected participants were instructed to generate the population of the first generation:

Design an alarm clock
1. The clock should be easy and safe to use. In addition, it should be inexpensive to manufacture.
2. Please elaborate your design as much as possible. Draw multiple sketches to represent the different sides of the clock if necessary.

The first requirement asks participants to take several design constraints into account: ease of use, safety and cost. These constraints were introduced in order to make the problem realistic, as well as provide evaluators with criteria for judging the practicality of the designs. The second requirement is intended to motivate the participants to create visual representations of their design scheme.

After participants finished their sketches, a question was asked to allow participants to explain the design ideas:

Please explain your design and describe the functionality.

Generation 1 evaluation for the combination condition: Seventy-five different participants then evaluated the designs produced in generation 1. Participants were asked to rate the designs on practicality and originality using seven-point Likert scales. Below, we show the instruction; Figure 3 shows an example of the rating interface.

Your fellow workers were asked to design alarm clocks in response to the following request.

Design an alarm clock
1. The clock should be easy and safe to use. In addition, it should be inexpensive to manufacture.
2. Please elaborate your design as much as possible. Draw multiple sketches to represent the different sides of the clock if necessary.

Each design has two parts: sketches (shown on the left) and text (shown on the right). Please evaluate the originality and the practicality of each design based on the given scales.

[Figure 2 layout: Generation 1 comprises sixty produced designs shared by both conditions. In Generation 2 and again in Generation 3, the control condition yields forty-five new produced designs and the experimental condition yields forty-five new combined designs.]



Figure 3. An example of the evaluation interface: the raters see a design above, and a set of Likert scales below. (The design shown in the example is the first design listed in Table 2.)

Generation 2 for the combination condition: The 60 designs produced in generation 1 were ranked in terms of creativity. The creativity score was calculated by averaging the scores of practicality and originality. Tournament selection was used to select 45 pairs of generation-1 designs to serve as "parents" for generation 2. Participants combined the pairs following the instructions below:

The designs on the right are from your fellow workers. Please create a new alarm clock by combining aspects of the two clocks shown.
1. The clock should be easy and safe to use. In addition, it should be inexpensive to manufacture.
2. Please elaborate your design as much as possible. Draw multiple sketches to represent the different sides of the clock if necessary.

By applying elitism (Deb 2001), the 15 highest-rated clocks from generation 1 were automatically promoted to generation 2. Therefore, there are 60 designs in generation 2: 45 combined designs and 15 elite designs. While elitism is considered an important aspect of modern genetic algorithms, in the later analysis we will show that increases in creativity are not due to its use. That is, we will test whether creativity increases even without the effect of this common technique.
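The bookkeeping for assembling generation 2 can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual tooling: the ratings are randomly generated placeholders (not study data), the number of raters per design used here is arbitrary, the names (`creativity_score`, `pick_parent`) are hypothetical, and requiring the two parents in a pair to be distinct designs is an assumption.

```python
import random
from statistics import mean

rng = random.Random(1)
RATERS_PER_DESIGN = 5  # placeholder value, chosen only for this example

# Placeholder ratings (NOT study data): 60 designs on two 1-7 scales.
designs = {
    f"g1_{i:02d}": {
        "practicality": [rng.randint(1, 7) for _ in range(RATERS_PER_DESIGN)],
        "originality": [rng.randint(1, 7) for _ in range(RATERS_PER_DESIGN)],
    }
    for i in range(60)
}


def creativity_score(ratings):
    """Average of the mean practicality and mean originality ratings."""
    return (mean(ratings["practicality"]) + mean(ratings["originality"])) / 2


scores = {name: creativity_score(r) for name, r in designs.items()}


def pick_parent(exclude=None):
    """Tournament of two: sample two designs and keep the higher-scoring one."""
    pool = [name for name in scores if name != exclude]
    a, b = rng.sample(pool, 2)
    return a if scores[a] >= scores[b] else b


# 45 parent pairs, each to be handed to one participant for combination.
pairs = []
for _ in range(45):
    first = pick_parent()
    second = pick_parent(exclude=first)  # assumption: parents are distinct
    pairs.append((first, second))

# Elitism: the 15 highest-scoring designs are promoted unchanged.
elites = sorted(scores, key=scores.get, reverse=True)[:15]

generation_2_size = len(pairs) + len(elites)  # 45 combined + 15 elite = 60
```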

Generation 2 for the control condition: Forty-five designs were collected simultaneously following the same procedure as in generation 1.

Generation 2 evaluation for the combination condition: Eighty-nine different participants, randomly selected, evaluated the designs in generation 2 following the same procedure used to evaluate the designs from generation 1.

Generation 3 for the combination condition: The 60 designs in generation 2 were ranked on creativity. Tournament selection was used to select 45 pairs of generation-2 designs to serve as parents for generation 3. Forty-five randomly selected participants combined the pairs as in generation 2.

Generation 3 for the control condition: Forty-five designs were collected simultaneously following the same procedure as in generation 1.

All design evaluations for both conditions: Altogether, 136 participants were involved in evaluating all 240 designs from the two conditions in the same way the designs in generation 1 were evaluated. Each design was rated by 15 participants.

The procedure described above is summarized in Table 1.



Table 1. Summary of the experiment design

Generation 1 (both conditions): Sixty randomly selected participants were instructed to generate the population of the first generation.
  The control condition: No evaluation necessary.
  The experiment condition: Generation 1 evaluation.

Generation 2
  The control condition: Forty-five designs were collected simultaneously following the same procedure as in generation 1. No evaluation necessary.
  The experiment condition: 45 pairs of generation-1 designs are selected as parents to be combined to produce generation 2 designs. Generation 2 evaluation.

Generation 3
  The control condition: Forty-five designs were collected simultaneously following the same procedure as in generation 1.
  The experiment condition: 45 pairs of generation-2 designs are selected as parents to be combined to produce generation 3 designs.

All design evaluations (both conditions): Altogether, 136 participants were involved in evaluating all 240 designs from the two conditions in the same way the designs in generation 1 were evaluated.
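As a consistency check on the counts reported in the procedure, the short sketch below tallies participants and designs by role. The dictionary labels are introduced here for illustration, and the tally assumes one design per designer, which the text implies but does not state explicitly for every generation.

```python
# Participant counts by role, as reported in the experiment design
# (assumption: each designer or combiner contributed exactly one design).
participants = {
    "generation 1 designers (shared by both conditions)": 60,
    "generation 1 evaluators": 75,
    "generation 2 combiners": 45,
    "generation 2 control designers": 45,
    "generation 2 evaluators": 89,
    "generation 3 combiners": 45,
    "generation 3 control designers": 45,
    "final evaluators (all designs)": 136,
}

# Newly produced designs per generation and condition.
designs = {
    "generation 1 (shared)": 60,
    "generation 2 combination": 45,
    "generation 2 control": 45,
    "generation 3 combination": 45,
    "generation 3 control": 45,
}

total_participants = sum(participants.values())
total_designs = sum(designs.values())
total_final_ratings = total_designs * 15  # each design rated by 15 participants

print(total_participants)   # 540
print(total_designs)        # 240
print(total_final_ratings)  # 3600
```

Under these assumptions the tallies agree with the 540 participants and 240 designs reported in the paper.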

Results

Table 2. Examples of clocks produced by the participants

Sketch Explanation

My design would be simple. There are a total of six buttons that would account for all the functionality of the clock. I'm of the KISS philosophy and would design as such. The single button on the top of the clock would be akin to the left click button of the mouse while the arrows would allow for multiple options.

The design is very minimal, with the main elements of the clock functions as the largest pieces. It is a basic rectangle with the large side as the clock display. The LED display makes up practically the entire side of the clock, as it is the main feature. The 2 sides have hole perforations to allow the sound of the alarm to ring loudly from the inside of the clock. The top is where the user can change the clock settings and stop the alarm. The large round dial is where the user can change the time, with a 3 position switch next to it to adjust the regular time, the alarm time, and a dial lock that disables the large dial. right next to it is a large rectangular button that takes up half of the top. This is to turn off the alarm when it goes off. Holding it down for 5 seconds will set it to alarm again in 15 minutes. It is easy to use as all the functions are on top and very easy to learn. It uses a numeral LED display so that it is easy to tell the time right away. It is safe in that it has a simple design and should be sturdy.

Two hours in one. Blue hands show the time, red hands showing the time when the alarm includes (open and manually adjusted). Clock attaches to the wall. Alarm goes off and stops by dragging the lower arm.

Among the 540 participants, 42% were female, 61% were native English speakers and 66% had earned college or graduate degrees. Ages ranged from 18 to 79 with a mean age of 29. Overall, 240 designs were collected. Ninety percent of the designs consisted of two pieces of information: sketches and text explanation. The sketches varied in many ways, including the level of detail and the specific nature of lines and shapes. Participants made use of the menu choices of the Google Docs drawing tool to draw the clocks: some sketched freehand, and others started with shape stencils. Some sketches portray everyday clocks. Others depict unusual clocks. Several clocks are shown in Table 2.

Are designs from the last generation more creative than the first generation? Participants had been asked to rate the practicality and originality of both generation 1 and generation 3 designs. Because the designs were rated by different sets of raters, we used the rating reliability calculator developed by Solomon to assess the interrater reliability (Ebel 1951; Solomon 2004). The interrater reliability is 0.75 for practicality and 0.77 for originality, based on every design being rated by 15 raters. Figure 4 shows a scatter plot of generation 1 and generation 3 designs with respect to originality and practicality. From the figure, we can see that generation-3 designs tend to lie further toward the original and practical ends of the scales than generation-1 designs.
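To illustrate the kind of reliability figure reported above, the sketch below computes a one-way intraclass correlation for the mean of k ratings, (MSB - MSW) / MSB, a standard estimate when each design is rated by a different set of raters. Whether this is exactly the computation performed by the cited calculator is an assumption; the function name and the toy ratings are hypothetical and are not study data.

```python
def reliability_of_mean_ratings(ratings):
    """One-way ICC for the mean of k ratings per design: (MSB - MSW) / MSB.

    `ratings` is a list of rows, one row per design, each holding k ratings.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 designs, each with the same k >= 2 ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-design and within-design sums of squares (one-way ANOVA).
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means) for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    if ms_between == 0:
        raise ValueError("designs do not differ; reliability is undefined")
    return (ms_between - ms_within) / ms_between


# Toy ratings (hypothetical): three designs, three ratings each.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
toy_reliability = reliability_of_mean_ratings(toy)  # (27 - 1) / 27
```

Reliability rises as the raters of a design agree with one another relative to how much the designs differ from each other; with identical ratings within every design it reaches 1.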

Figure 4. The Originality and Practicality of All Designs from Generation 1 and Generation 3

As discussed in the Method section, a binary measurement was used to judge the creativity of the designs: only designs that exceed a certain threshold on both the scales of originality and practicality qualify as creative designs, using the method of Finke et al. (1996). Consistent with their approach, we chose the approximate mean value of the ratings across all designs as the thresholds: 4.5 for practicality and 4.0 for originality. As a result, 36 out of 60 designs in generation 3 were classified as creative designs, in contrast to 15 out of 60 in generation 1. This is shown in Figure 5.
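The binary classification rule can be written down directly. The sketch below is illustrative only: the function name and the three toy designs are hypothetical, and reading "exceed" as a strict inequality is an assumption.

```python
PRACTICALITY_THRESHOLD = 4.5  # threshold reported in the text
ORIGINALITY_THRESHOLD = 4.0   # threshold reported in the text


def is_creative(mean_practicality, mean_originality):
    """A design is creative only if it exceeds BOTH thresholds (strictly)."""
    return (mean_practicality > PRACTICALITY_THRESHOLD
            and mean_originality > ORIGINALITY_THRESHOLD)


# Toy mean ratings (hypothetical, not study data): (practicality, originality)
toy_designs = {
    "design_a": (5.2, 4.6),  # above both thresholds
    "design_b": (5.8, 3.1),  # practical, but not original enough
    "design_c": (3.9, 5.5),  # original, but not practical enough
}

creative = [name for name, (p, o) in toy_designs.items() if is_creative(p, o)]
print(creative)  # ['design_a']
```

Requiring both thresholds means a highly original but impractical design, or the reverse, is not counted as creative.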

The proportions are significantly different (χ²(1, N=120) = 13.64, p < 0.01). The difference is not an artifact of elitism: only three of the 15 elite clocks from generation 1 are still present in generation 3, and, if they are removed, the difference between the generations remains significant (χ²(1, N=117) = 11.75, p < 0.01). Thus, the improvement hypothesis, that creativity will increase in later generations, is supported.

We next compare the creativity of designs across conditions. Both the control condition and the combination condition start from the same initial designs in generation 1. Differences begin in generation 2: there are 90 newly combined designs in the last two generations of the combination condition and 90 newly produced designs in the control condition. Thus, we compared the creativity of these 90 designs in the two conditions. There were 45 creative designs in the combination condition and 26 creative designs in the control condition. The proportion of creative designs per generation is shown in Figure 6. The proportions are significantly different (χ²(1, N=180) = 7.54, p < 0.01). Consequently, the comparison hypothesis, that creativity will be greater if designs are combined than if they are generated anew, is supported.
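The three test statistics can be recomputed from the reported counts. The paper does not say which variant of the chi-square test was run; the reported values are consistent with a 2x2 Pearson chi-square with Yates' continuity correction, so the sketch below assumes that variant. The function name is hypothetical. The counts for the elite-removed comparison (33 creative out of 57) are an inference: they assume the three surviving elite clocks were all among the creative generation 3 designs.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic (1 df) for the 2x2 table [[a, b], [c, d]],
    using Yates' continuity correction (an assumed choice, see lead-in)."""
    n = a + b + c + d
    # The correction shrinks |ad - bc| by n/2, never below zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows are groups; columns are (creative, not creative).
gen3_vs_gen1 = yates_chi_square(36, 24, 15, 45)      # 60 designs per generation
without_elites = yates_chi_square(33, 24, 15, 45)    # inferred: 3 creative elites removed
combo_vs_control = yates_chi_square(45, 45, 26, 64)  # 90 new designs per condition

print(f"{gen3_vs_gen1:.2f} {without_elites:.2f} {combo_vs_control:.2f}")
```

Under these assumptions the script prints `13.64 11.75 7.54`, matching the statistics reported in the text.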

Figure 5. Proportion of Creative Designs between the First and Last Generation (Error Bars Represent 95% Confidence Intervals)

Figure 6. Proportion of Creative Designs between the Experimental and Control Conditions (Error Bars Represent 95% Confidence Intervals)



Among all 240 designs from the three generations and two conditions, the design judged most creative is from generation 3 of the combination condition. Figure 7 shows the sketches and the text of this combined design. The two sketches at the top of the figure are the parent designs of the combined design. The paragraphs above the parent designs are the participants’ explanations of their designs. We can clearly understand from the sketches and verbal explanations how the designer selected and combined the features of the parent designs in order to create a new design.

Triangle shape ALARM CLOCK and there are many option, mechanism clock with digital date display and there are buttons and nob for adjust or change the time and date at right side of the clock. Alarm time set and on/off buttons placed at left side of the clock, there we can fix the alarm time which time we want. At the top of clock there is a sun type speaker for alarm tone. Left side of the clock, tone changer button and volume control buttons are there.

The alarm is set by keeping the very small pointer to needed time by rotating the pointer tip at the back of clock to stop alarm press or pull the ring at the top of the clock.

In my design I combined key elements from both previous designs such as the triangle shape and digital calendar included on the first design as well as having the volume and alarm set functions on the side of the clock. In my design, I also have a ring that you pull on the top of the clock that turns the alarm on and off which is shown in the second design. This clock also has an am/fm radio and a snooze button. It is also in a very stylish pyramid shape and has a sun for the speakers. The volume is controlled by a knob and not buttons and the time can be set using minute/hour combinations. There is also a digital time/date clock in the bottom left hand corner of the face of the alarm clock.

Figure 7. The Design Judged Most Creative (Practicality Score 6.2, Originality Score 6.0). Above and below the sketches are the participants’ design descriptions.

Discussion and Future Work

In sum, the results demonstrate the effectiveness of combination in two ways. The designs in the last generation were judged more creative than those of the first generation, and the combination system produced more creative designs than a process in which independent designs are generated.



Future research might consider the following questions. First, do some combinations of ideas produce better designs than other combinations (cf. Yu and Sakamoto 2011)? This might be studied by controlling the combination of ideas based not only on the ratings of the ideas, but also on different features and objectives (Yu, Sakamoto and Nickerson 2011).

Second, is creativity as judged by the crowd related to creativity as judged by product design experts? Creativity studies usually hinge on judgment. Finke et al. (1996) warned that experts often have low inter-rater agreement with respect to judged creativity. Here, we used a crowd to judge the output: this has the advantage that a sizeable number of people can be polled, but has the disadvantage that those polled may not have sufficient expertise to judge, for example, the probable expense of one design’s manufacture over another. Thus, future studies might introduce experts at various stages in the process: at the very end to evaluate practicality, and also perhaps after every generation to help define the potential fitness. The judgment of experts might be weighted differently from the judgment of novices (Raykar et al. 2010). It may be possible to test the ability of experts and novices to generate innovations, that is, implementations of creative ideas, in design domains that allow inexpensive custom manufacturing, for example simple clothing and jewelry design (Brabham 2010). Moreover, in some cases, the judgment of novices may have advantages: the clocks that novices favored might be more likely to be purchased, or at least preferred, because the novices probably represent the buying demographic better than experts.

Third, in what domains does this method work? In previous work, we have shown that this technique works on an open-ended environmental problem, and on another product design problem (Nickerson and Sakamoto 2010; Nickerson, Sakamoto and Yu 2011; Sakamoto and Bao 2011; Yu and Nickerson 2011). There are, however, doubtless domains for which the proposed process is ill suited, as well as domains for which it is well suited, and characterizing these domains might aid in the effective application of this system.

There are also other ways of building idea generation systems. For example, designs might be filtered between generations. Some designs might be enhanced or modified one at a time (a kind of mutation or refinement) before being combined. More than two designs might be combined. Thus, there is a large space of idea generation systems. In principle, alternative systems can be instantiated from this space and tested against each other.

Concluding Thoughts

This work presents a way to structure the crowd to generate creative ideas. The process is grounded in the theory that creativity stems from the combination of ideas. The idea generation process using combination is a variant of a human-based genetic algorithm: a crowd produces designs to form an initial design population, a second crowd generates new designs by combining the initial designs, and a third crowd generates new designs by combining the designs produced by the second crowd. The results show that the creativity of the designs in the last generation was significantly higher than the creativity of those in the first generation. Moreover, compared to designs from a control condition, in which the same number of designs was collected, the designs evolved through the combination system were judged more creative.
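The generational structure described above can be sketched as a loop in which people, not code, perform the combination step. This is an illustrative outline only, not the system used in the study. The function and parameter names are hypothetical, `combine` and `rate` stand in for crowd participants, and uniform random pairing of parents is an assumption made purely for illustration. The split of 45 new designs plus 15 carried-over elite designs per generation is inferred from the counts reported in the results (60 designs per generation, 90 new designs over two generations, 15 elite clocks).

```python
import random
from typing import Callable, List

Design = str  # stand-in for a sketch plus its text explanation

def run_generations(
    initial: List[Design],
    combine: Callable[[Design, Design], Design],  # done by a human participant
    rate: Callable[[Design], float],              # hypothetical scalar crowd rating
    generations: int = 3,
    n_new: int = 45,     # inferred from the reported counts
    n_elite: int = 15,   # inferred from the reported counts
    seed: int = 0,
) -> List[List[Design]]:
    """Return the design population of every generation, first to last."""
    rng = random.Random(seed)
    history = [list(initial)]
    for _ in range(generations - 1):
        parents = history[-1]
        # Elitism: carry the highest-rated designs forward unchanged.
        elites = sorted(parents, key=rate, reverse=True)[:n_elite]
        # Assumption: each new design combines two distinct parents drawn
        # uniformly at random; the study's selection rule may differ.
        children = [combine(*rng.sample(parents, 2)) for _ in range(n_new)]
        history.append(elites + children)
    return history
```

Passing `combine` and `rate` as callbacks keeps the human steps separate from the bookkeeping, which is also where variants such as filtering or combining more than two designs could be swapped in.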

This process has advantages over a currently popular technique of crowdsourcing, the online contest model (e.g. Piller 2010; Leimeister et al. 2009). The proposed process organizes the crowd in a highly structured way that allows many people to contribute to a bigger task. Unlike the online contest model, in which only a small number of people contribute to the winning design, the combination system allows for a broader base of fruitful participation. It shows that the crowd can generate creative designs without indulging in competition that may end up wasting good ideas. The structured process of co-design and co-creation is particularly suited to firms or institutions committed to open innovation, because it gives the sponsor control of, and visibility into, the process. The creative ideas evolved through the system are evaluated by the crowd, and are thus arguably more likely to be accepted by the marketplace, especially if the evaluation crowd is chosen to match the eventual consumer as closely as possible.

However, there are attributes of contest-based systems that are also desirable. They provide strong motivation, and may attract individuals with expertise in the problem domain area. Consequently, it may be that hybrid systems could do better than either alternative process. For example, a contest might be used to provide motivation, while at the same time intermediate designs might be made public for use by later generation designers. The Matlab contests (Gulley 2001), in which participants can modify each other’s entries, have this flavor, but many other hybrid systems can be created by controlling the visibility of the designs and the instructions for combination and evaluation.

At the macro level of institutional innovation, this work suggests that design might be explored as an activity undertaken by hundreds or thousands of individuals who interact through visual and verbal representations. The design undertaken here was not accomplished by a tightly knit online community, but instead was the result of collaborative co-creation through collective action, without much social interaction. Moreover, the collective action was accomplished through an algorithm, a selection process driven by the crowd itself. Many questions remain to be answered, the most important being the issue of expert versus novice participation, and the eventual market acceptance of the generated designs. But, at the organizational level, this work suggests new kinds of organizational structure much different from firm-based internal innovation processes, and also different from typical market competition. There exists a whole class of unexplored organizational structures. Technology makes these structures possible; however, it may be human-driven experimentation that will lead to the discovery of different and effective organizations. Designs may be produced by crowds, emerging out of thousands of constrained interactions: these are new forms of collective creativity.

Acknowledgements

This work was funded by the National Science Foundation under grants IIS-0855995 and IIS-0968561.

References

von Ahn, L. and Dabbish, L. 2004. Labeling images with a computer game. Proc. CHI 2004, ACM Press, 319-326.

Amabile, T. 1996. Creativity in context. Colorado: Westview Press.

Amabile, T. 1998. How to kill creativity. Harvard Business Review, September-October, 77-87.

Amazon Mechanical Turk. https://www.mturk.com/mturk/welcome.

Bederson, B. B., Hu, C., and Resnik, P. 2010. Translation by iterative collaboration between monolingual users. Proc. Graphics Interface (GI) conference.

Benkler, Y. 2006. The wealth of networks: How social production transforms markets and freedom, CT: Yale University Press.

Bernstein, M.S., Little, G., Miller, R.C., Hartmann, B., Ackerman, M.S., Karger, D.R., Crowell, D., and Panovich, K. 2010. Soylent: a word processor with a crowd inside. In Proceedings of the 23rd annual ACM symposium on User interface software and technology (UIST '10). ACM, New York, NY, USA, 313-322.

Biewald L. and Allick M. 2011. Massive multiplayer Human Computation for fun, money, and survival. CHI 2011 Workshop on Crowdsourcing and Human Computation.

Bao, J., Sakamoto, Y., and Nickerson, J. V. 2011. Evaluating Design Solutions Using Crowds, AMCIS.

Boden, M. A. 1996. Dimensions of creativity. MA: The MIT Press.

Brabham, D. 2010. Moving the crowd at Threadless: Motivations for participation in a crowdsourcing application. Information, Communication and Society. 13:1122-1145.

Buckley P. and Carter M. 2004. A formal analysis of knowledge combination in multinational enterprises, Journal of international business studies, 35.

Chesbrough, H., Vanhaverbeke, W., West, J. 2006. Open Innovation: Researching a New Paradigm. UK: Oxford University Press.

Crowston, K., Wei K., Howison J., and Wiggins A. Free/Libre Open Source Software Development: What we know and what we do not know, ACM computing surveys, in press.

Davis, L. 1987. Genetic Algorithms and Simulated Annealing. Los Altos, CA: Morgan Kaufmann.

Deb, K. 2001. Multi-objective optimization using evolutionary algorithms. UK: Wiley, Chichester.

Dennis, A.R., Valacich, J.S., Connolly, T., and Wynne, B.E. 1996. Process structuring in electronic brainstorming. Information Systems Research, 7 (2):268.

Diehl, M., and Stroebe, W. 1987. Productivity loss in brainstorming groups: Toward the solution of a riddle. Journal of personality and social psychology, 53(3), 497.

Ebel R.L. 1951. Estimation of the reliability of ratings. Psychometrika, 16:407-424.

Eveleens, C. 2010. Innovation management; a literature review of innovation process models and their implications. Lectoraat Innovatie Publieke Sector.



Finke R. 1990. Creative imagery: Discoveries and inventions in visualization, New York: Lawrence Erlbaum.

Finke, R. A., Ward, T. B., and Smith, S. M. 1996. Creative cognition: Theory, research, and applications. MA: MIT press.

Fleming, L. 2001. Recombinant uncertainty in technological search, Management Science. 47 (1): 117-132.

George, J. 2007. Creativity in organizations. Academy of Management Annals, 1(1), 439-477.

Gero J. S., and Maher M. L. (eds) 1993. Modeling creativity and knowledge-based creative design. Hillsdale, NJ: Lawrence Erlbaum.

Goldschmidt, G., and Litan Sever A. 2009. From text to design solution: inspiring design ideas with texts. Paper presented at International Conference on Engineering design, ICED’09.

Goldberg, D. E. 1989. Genetic Algorithms in Search, Optimization and Machine Learning, MA: Kluwer Academic Publishers.

Gooding, D. C. 2010. Visualizing scientific inference. Topics in Cognitive Science, 2 (1):15-35.

Google Docs Drawing Application. http://docs0.google.com/demo/edit?id=scACRQaIm3t83kVISWPhWfrqx#drawing

Gulley, N. 2001. Patterns of innovation: a web-based MATLAB programming contest. CHI’01 extended abstracts on Human factors in computing systems.

Hargadon, A. and Bechky, B. 2006. When collections of creatives become creative collectives: A field study of problem solving at work. Organization Science, 17(4), 484-500.

Hayek, F. A. 1945. The use of knowledge in society, American Economic review, 35(4), 519-530.

Henderson, R. M. and Clark, K. B. 1990. Architectural innovation: the reconfiguration of existing product technologies and the failure of established firms. Administrative science Quarterly, 35 (1).

Hevner, A., March, S., Park, J., and Ram, S. 2004. Design Science in information systems research, MIS Quarterly, 28:1, 75-105.

Holland, J. H. 1975. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.

Howe, J. 2006. The Rise of Crowdsourcing, Wired, 14(6), URL (accessed 8 September 2011): http://www.wired.com/wired/archive/14.06/crowds.html

Kosorukoff, A. 2001. Human based genetic algorithm. Paper presented at IEEE Conference on Systems, Man, and Cybernetics.

Kittur, A., Chi, E.H., and Suh, B. 2008. Crowdsourcing user studies with Mechanical Turk. Proc. CHI 2008, ACM, 453-456.

Kohn, N. W., Paulus, P. B. and Choi, Y. 2011. Building on the ideas of others: An examination of the idea combination process. Journal of Experimental Social Psychology, 47, 554-561.

Leimeister, J. M., Huber, M., Bretschneider, U., and Krcmar, H. 2009. Leveraging crowdsourcing – Activation-Supporting components in IT-based idea competitions. Journal of Management Information Systems, 26(1), 197–224.

Litchfield, R. C. 2008. Brainstorming reconsidered: A goal-based view. The Academy of Management Review Archive, 33(3), 649-668.

Malone, T. W., Laubacher, R., and Dellarocas, C. N., 2009. Harnessing crowds: Mapping the genome of collective intelligence. MIT Sloan Research, Paper No. 4732-09. Available at SSRN: http://ssrn.com/abstract=1381502

Mullen, B., Johnson C., and Salas E. 1991. Productivity loss in brainstorming groups: A meta-analytic integration, Basic and Applied Social Psychology, 12(1), 3-23.

Mumford, M. D., Mobley, M. I., Uhlman, C. E., Reiter-Palmon, R., and Doares, L. 1991. Process analytic models of creative thought. Creativity Research Journal, 4, 91-122.

Nelson, R. R. and Winter S. G. 1982. An evolutionary theory of economic change. Cambridge: Belknap Press/Harvard University Press.

Nickerson, J.V. and Sakamoto, Y. 2010. Crowdsourcing Creativity: Combining Ideas in Networks, Workshop on Information in Networks.

Nickerson, J.V., Sakamoto, Y. and Yu, L. 2011. Structures for Creativity: The Crowdsourcing of Design, Workshop at CHI.

Nunamaker, J.F., Dennis, A.R., Valacich, J.S., Vogel, D., and George, J.F. 1991. Electronic meeting systems. Communications of the ACM, 34 (7), 40-61.

Olsson, O. and Frey B. 2002. Entrepreneurship as recombinant growth, Small business Economics, 19: 69-80.



Osborn, A.F. 1957. Applied imagination: Principles and procedures of creative problem solving (Third Revised Edition). New York: Charles Scribner’s Sons.

Paulus, P. B., Larey, T. S., and Ortega, A. H. 1995. Performance and perceptions of brainstormers in an organizational setting. Basic and Applied Social Psychology, 17, 249–265.

Putman, V. L., and Paulus, P. B. 2009. Brainstorming, brainstorming rules and decision making. Journal of Creative Behavior, 43, 23-39.

Piller, F. T. 2010. Open innovation with customers: Crowdsourcing and co-creation at threadless. Available at SSRN: http://ssrn.com/abstract=1688018

Quinn, A. J., Bederson, B. B. Human Computation: A Survey and Taxonomy of a Growing Field. Proc. CHI 2011.

Raykar, V. C., Yu, S., Zhao, L. H., Valadez, G. H., Florin, C., Bogoni, L., and Moy, L. 2010. Learning from crowds. Journal of Machine Learning Research. 11, 7, 1297-1322.

Rietzschel, E. F., Nijstad, B. A., and Stroebe, W. 2006. Productivity is not enough: A comparison of interactive and nominal brainstorming groups on idea generation and selection. Journal of Experimental Social Psychology, 42, 244-251.

Ross, J., Irani, L., Silberman, M.S., Zaldivar, A., and Tomlinson, B. 2010. Who are the crowdworkers? Shifting demographics in Mechanical Turk. In: alt.CHI session of CHI 2010 Extended Abstracts on Human Factors in Computing Systems.

Sakamoto Y. and Bao, J. Testing Tournament Selection in Creative Problem Solving with Crowds, ICIS 2011.

Simonton, D. K. 2003. Scientific creativity as constrained stochastic behavior: The integration of product, person, and process perspectives. Psychological Bulletin, 129, 475-494.

Simonton, D. K. 2004. Creativity in science: Chance, Logic, Genius and Zeitgeist, New York: Cambridge University Press.

Smith, S. M., Ward, T. B., and Schumacher, J. S. 1993. Constraining effects of examples in a creative generation task. Memory and Cognition, 21(6), 837-845.

Solomon, D. J. 2004. The rating reliability calculator. BMC Medical Research Methodology (4:11).

Spears, W. M. 1993. Crossover or mutation? In Foundations of Genetic Algorithms 2, edited by L.D. Whitley. CA: Morgan Kaufmann.

Stevens, G. A. and Burley, J. 1997. 3000 Raw ideas=1 Commercial success!, Research-Technology Management, 16-27.

Taylor, A., and Greve, H. R. 2006. Superman or the fantastic four? Knowledge combination and experience in innovative teams. Academy of Management Journal, 49(4), 693-706.

Thagard, P. 1992. Conceptual revolutions. New Jersey: Princeton University Press.

Tolstoy D. 2009. Knowledge combination and knowledge creation in a foreign–market network. Journal of small business management, 47(2), 202-220.

Tversky, B. and Chou, J. 2010. Creativity: depth and breadth. Paper presented at the First International Conference on Design Creativity.

Yu, L. and Nickerson, J.V. 2011. Cooks or Cobblers? Crowd Creativity through Combination. Proceedings of the CHI’11 Conference on Human Factors in Computing Systems, ACM Press.

Yu L. and Sakamoto, Y. 2011. Feature Selection in Crowd Creativity. HCII, Springer.

Yu, L., Sakamoto, Y., and Nickerson, J.V. 2011. Feature Propagation in Idea Networks. Workshop on Information in Networks.

Wasko, M. M. L., and Faraj, S. 2005. Why should I share? Examining social capital and knowledge contribution in electronic networks of practice. MIS Quarterly, 29(1), 35-57.
