
Page 1: Implementing an Open-access, Data ... - Computer Sciencepeople.cs.vt.edu/~tilevich/papers/IEEEComputer2017.pdf · Computer Science Virginia Tech Blacksburg, Virginia, USA acbart@vt.edu,

Implementing an Open-access, Data Science Programming Environment for Learners

Austin Cory Bart∗, Javier Tibau∗†, Eli Tilevich∗, Clifford A. Shaffer∗, Dennis Kafura∗

∗Computer Science, Virginia Tech, Blacksburg, Virginia, USA

†Escuela Superior Politécnica del Litoral, ESPOL, Km 30.5 Vía Perimetral, Guayaquil, Ecuador

Abstract: A key retention issue when educating computing novices is ensuring that the frustrations of mastering programming fundamentals do not demotivate students. Non-CS majors often struggle to find relevance in traditional computing curricula that tend to either emphasize abstract concepts, focus on non-practical entertainment (e.g., game/animation design), or rely on decontextualized settings. To assist with these issues, this paper introduces BlockPy (https://www.blockpy.com), a web-based, open-access Python programming environment that supports introductory programmers in a Data Science context. It promotes long-term transfer by scaffolding an introduction to textual programming through a block-based programming view, ideal for beginners. BlockPy is designed for informal learners and formal classes, and provides guiding feedback within interactive programming problems. The results from a pilot of the initial deployment of BlockPy indicate that the environment addresses many problems faced by novice learners.

Keywords: Computer science education; Computer aided instruction; Data analysis; Web services

As computing becomes pervasive in our society across fields, working professionals increasingly need some expertise in computing alongside their core domain knowledge. General education computing curricula at the university level (e.g., “Computational Thinking” courses) are scaling, Massive Open Online Courses are flourishing, and a large class of learners is pursuing non-formal learning experiences on their own. Both these traditional and non-traditional learners often have little experience with computing and low self-efficacy, and are uncertain how computing benefits their long-term career goals. Not only do they need special scaffolding unique to their ability and motivational level, but they also need fewer barriers in accessing these materials. Our new tool to better serve this population is BlockPy: an open-access, web-based Python environment for data science that supports learners with guided instruction and an accessible interface (https://www.blockpy.com).

Why Data Science? Modern approaches to contextualizing introductory courses have focused on making the experience “fun” and “interesting”, with an emphasis on game design and media computation. However, student motivation is a complex construct dependent on more than just situational interest; holistic models of motivation suggest that students also need to feel that the material is useful to learn, and that long-term career goals are satisfied [1]. Contexts like Media Computation are not always perceived as authentically useful for non-majors, based on a study by Guzdial et al. [2].

We suggest that Data Science is a motivating context that can appeal in a different way to students, thanks to the widespread need for data processing in other majors. In previous work, we reported on the affordances and impacts of Data Science as a learning context [3]. Often, students study computing to learn how to manage the dizzying quantities of data being stored and analyzed in a discipline or for a specific self-derived project. When the content is grounded in this context, students can be more easily convinced of the relevance of computing and can understand more clearly how the materials fit together. Aligning the context with students’ long-term needs also means that students learn skills more relevant to those needs. Finally, data science as a context naturally lends itself to teaching topics related to structured data, iteration, and other core material, making it a pedagogically valuable context to the CS instructor.

Why Python? Python has become one of the most popular introductory programming languages [4], thanks to its simple syntax combined with impressive power. It includes strong support for data science thanks to popular libraries like MatPlotLib. Python requires little code to accomplish interesting things, so novices are not bogged down with complex syntactical details. Its widespread use in both introductory classes and industry further motivated our choice.

Why Blocks? Any kind of programming is a challenge to beginners, due to the nature of coding as the “most powerful, but least usable human-computer interface ever invented” [5]. Block-based languages (such as Scratch and App Inventor) have been shown to reduce the start-up time students need to begin programming and accomplish tasks [6], [7]. By providing structure and an immediate view into the entire user interface of a language, blocks greatly benefit introductory learners.

Why Another Python Web Environment? There are several environments available today that let students and instructors write Python in the browser, including CodeSkulptor [8], Pythy [9], and the Online Python Tutor [10]. BlockPy stands on the shoulders of giants, integrating features inspired by these environments and introducing novel ones. But none of these existing Python environments transitions students into textual programming languages.

BlockPy was designed to provide dual support for both block-based and text-based code authoring. At any time, the student can switch freely between a block-based view of their code and a traditional text-based view. This powerful feature is inspired by Pencil Code, which uses its own Logo language [11], and similar implementations have been successful as a fading scaffold for students [12].

Figure 1. BlockPy in action

BlockPy extends Pythy’s [9] support for “assignments”, problems that integrate presentation with assessment. However, Pythy only supports traditional unit testing to provide students with feedback, while BlockPy provides an API for code analysis and free-form text guidance that instructors can configure to give helpful suggestions to their students. Furthermore, Pythy has limited support for data science, whereas BlockPy has a rich library of data sources and a MatPlotLib-based plotting API.

CodeSkulptor, Pythy, and BlockPy all use the same internal engine for running Python code (“Skulpt”). Although CodeSkulptor has an extensive API for creating user interfaces and games, it provides a non-standard library. This library is suitable for beginners, but it does not aid the transition to serious programming environments. In BlockPy, the philosophy is to maintain approximate compatibility with real systems. Instead of a custom plotting API, for instance, we mimic the MatPlotLib interface.

The Online Python Tutor (OPT) has proven to be a useful tool for visualizing program state. However, OPT gives a depth of detail that can overwhelm introductory students (e.g., terminology such as Frames and Objects, which might be foreign to students). BlockPy’s state explorer does not attempt to match OPT’s thoroughness, but instead provides a helpful yet simple picture of program state. Additionally, we avoid OPT’s server dependency by relying on Skulpt, which runs in the browser.

I. BLOCKPY

The primary design goals of BlockPy are as follows.

1) Reduce barriers to learning programming.

2) Promote authenticity by empowering students to complete real-world problems.

3) Promote maturity by faded scaffolds (e.g., transitioning from blocks to text).

4) Minimize the need for help from human instructors.

A. Open-source, Open-access

The fundamental vision of BlockPy is a highly accessible, web-based platform for anyone to learn how to program. All code is open-source, and leverages a number of open-source libraries. There is no registration needed to use the software, although there are features that benefit from free registration, such as managing classrooms. We provide guided learning materials, to be shared by educators.

The BlockPy editor continuously stores user code as it is entered. Logs are stored at the keystroke level for future program analysis (described in Section I-E). The latest version of the user’s code is therefore available between sessions. When operating in offline mode, the code is stored in the LocalStorage browser object; when the connection is reestablished, synchronization is performed.

B. Python Execution

The BlockPy system is built to work offline, ideal for places where internet connectivity is unreliable. Python code execution is achieved through a modified instance of the Skulpt JavaScript library. Skulpt is a full Python parser and compiler, supporting almost all Python language features by generating JavaScript code. This includes partial support for the rich Python standard library. The Skulpt execution environment resides entirely within the user’s browser, so there is no reliance on an external server beyond the initial page load.

C. Block-based Python

To support introductory learners as they grapple with Python syntax, the initial interface in BlockPy is block-based, using the popular Blockly JavaScript library. Language features (iteration, decision, variable assignment and access, etc.) are contained in a toolbox on the left side, from which users drag-and-drop blocks onto a canvas. BlockPy’s block interface only generates syntactically valid Python code, enforced by the “snapping” connectors of the blocks (although it is possible to generate semantically incorrect code, as discussed later). This block interface is synchronized with a text interface; Section I-H describes these two interfaces.

An important question is how many language details should be exposed, and at what rate. A rarely used feature of for loops in Python is that they can contain an else clause that is executed upon successful completion of the loop (that is, when it is not prematurely escaped using a break statement). This advanced language feature is similar to a finally statement with exceptions. However, if an else clause were made available to beginners first trying to grapple with iteration, it is likely they would confuse the concept with the conditional else clause used in if statements. Cognitive load is hard on beginners, and the user interface needs to avoid exposing unnecessary details where possible. While hiding else bodies in a for loop is a clear case, there are more subtle examples. It can be difficult to recognize when the learner is ready to use parallel assignment, and therefore should be able to specify multiple variables on the left side of an assignment block. A block-based language forces a teacher to make important decisions about how to expose language features. As part of the future work of BlockPy, we are experimenting with exposing language features at different rates, adjustable by the instructor, so the system can expose a progressively more accurate language model.
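As a small illustration of the loop else clause discussed here, the following standard Python sketch shows when the clause runs. The function name is chosen for this example only and is not part of BlockPy.

```python
def find_first_negative(numbers):
    """Describe the first negative number in a list, if there is one."""
    for n in numbers:
        if n < 0:
            result = f"found {n}"
            break  # leaving the loop early skips the else clause below
    else:
        # Runs only when the loop finished without hitting `break`.
        result = "no negatives"
    return result


print(find_first_negative([3, -1, 4]))
print(find_first_negative([3, 1, 4]))
```

The else keyword here is attached to the for statement, not to the if inside the loop, which is exactly the distinction a beginner is likely to miss.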

D. Adaptive Guided Practice

One of the most powerful features of BlockPy is the interactive, guided feedback feature. A limitation of programming environments like Snap! is that they are not pedagogically interactive: students completing an assignment in the system are not guided to success. The learner must decide when they have completed their program, and whether it meets the specification. For independent learners outside of a formal learning experience, this requires high levels of self-regulation and meta-cognition. BlockPy’s adaptive elements follow an Expert model; when students run their code, it is checked against instructor-provided logic. If the student code fails for some reason, they are offered a suggestion. Correct code gives a green “Complete” mark; we have found that this positive marker has motivating power.

In the Instructor Mode, teachers provide a problem instance. First, a WYSIWYG rich-text editor edits the problem description, supporting any valid HTML content (e.g., images, links). Second, the instructor provides code in special canvases that affect the students’ experience. Instructors write this code using the same text/block interface that students use. First is the “starting code”, shown to the student when they begin the problem, avoiding a blank canvas. The second instructor code defines interactive feedback, which can access the students’ code, their final output, and a complete trace of their program’s state. The checking system can declare the code to be correct, or display an HTML string that is rendered as user feedback. The instructor is free to write whatever logic they want, such as searching for a specific Abstract Syntax Tree (AST) element, testing the outputs on the console, or walking through the program’s state to satisfy invariants. An API for common checks is evolving based on common use cases, such as parsing the program’s AST to ensure that students are not calling forbidden built-in Python functions. We are also exploring interventions an instructor can make beyond rendering textual feedback; perhaps displaying a pop-up dialog with an embedded instructional video, or alerting an instructor to provide Just-In-Time instruction for a particular struggling student. Finally, we are experimenting with a new system for modeling students’ misconceptions related to programming, in order to provide better guidance tuned to students’ specific mistakes.
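The forbidden-function check mentioned above can be sketched with Python's standard ast module. This is not BlockPy's actual feedback API; the function names, the FORBIDDEN set, and the feedback wording are assumptions made for illustration.

```python
import ast

# Hypothetical list of built-ins an instructor might disallow for a problem.
FORBIDDEN = {"sum", "max", "min"}


def find_forbidden_calls(student_code, forbidden=FORBIDDEN):
    """Return the sorted names of forbidden functions called directly by name."""
    found = set()
    for node in ast.walk(ast.parse(student_code)):
        # Only direct calls such as sum(xs) are detected; aliases are not.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in forbidden:
                found.add(node.func.id)
    return sorted(found)


def check(student_code):
    """Return None when the code passes, otherwise an HTML feedback string."""
    bad = find_forbidden_calls(student_code)
    if bad:
        return "<p>Please solve this without calling: " + ", ".join(bad) + "</p>"
    return None


print(check("total = sum([1, 2, 3])"))
```

Returning either None or an HTML string mirrors the two outcomes described in the text: the code is declared correct, or feedback is rendered for the student.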

E. Program Analysis for Deeper Learning

BlockPy uses simple program analysis techniques to find both general mistakes that novices make, and problem-specific errors. For example, beginners often fail to understand the true purpose of certain variables, and incorrectly include them due to mimicking an example. By performing simple variable liveness analyses, we can identify these variables and raise an error. Most modern editors feature this kind of analysis, but usually trust that the user has a reason and respond passively; the nature of our learners requires stronger responses.
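A much-simplified version of such a check can be written with Python's ast module. The sketch below is an assumption-laden stand-in, not BlockPy's analysis: it only reports names that are assigned somewhere and never read anywhere, ignoring control flow and scope, whereas a real liveness analysis tracks reads and writes along each execution path.

```python
import ast


def unused_variables(source):
    """Return names that are assigned somewhere but never read anywhere."""
    assigned, read = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                read.add(node.id)
    return sorted(assigned - read)


student_code = """
count = 0
total = 0
for value in [1, 2, 3]:
    total = total + value
print(total)
"""
print(unused_variables(student_code))
```

Here count is copied from a counting example but never used, which is the kind of mimicry the paragraph describes.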

Since we securely record the history of users’ programs and log interface interactions, we can mine this repository of code to infer common patterns that suggest undesirable learner behavior. For example, users who frequently move the same blocks without progressing toward the problem objectives might take longer on the problem than other users. Alternatively, students who pick a decision block to complete a problem about iteration might need extra attention. By proving or disproving such hypotheses, we can improve the automatic feedback of the system and provide more individualized support.
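One way to test a hypothesis like the first one is to count how often each block is moved within a session. The sketch below assumes a made-up log format (dictionaries with "action" and "block_id" keys) and an arbitrary threshold; neither reflects BlockPy's real log schema.

```python
from collections import Counter


def repeatedly_moved_blocks(events, threshold=3):
    """Return ids of blocks moved at least `threshold` times in one log."""
    moves = Counter(e["block_id"] for e in events if e["action"] == "move")
    return sorted(block for block, n in moves.items() if n >= threshold)


# Hypothetical interaction log for one student on one problem.
log = [
    {"action": "create", "block_id": "b1"},
    {"action": "move", "block_id": "b1"},
    {"action": "move", "block_id": "b1"},
    {"action": "move", "block_id": "b2"},
    {"action": "move", "block_id": "b1"},
]
print(repeatedly_moved_blocks(log))
```

The flagged sessions could then be compared against time-on-task to see whether the pattern actually predicts slower progress.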

F. LTI Support

BlockPy supports the LTI (Learning Tools Interoperability) protocol. This is a mechanism by which instructors can embed questions in their existing course management software (e.g., Canvas, OpenEdX) and receive assignment outcomes (e.g., a grade).

A typical learner uses BlockPy without ever being aware of LTI. An instructor using the system obtains a secret key and configuration URL that is used within their Learning Management System (LMS). Students on a course website may use BlockPy without registering for a BlockPy account: the first time they log in through the link provided by their LMS, they are invisibly registered in the system with a regular account through additional information from the LMS. As students complete work, assignment progress is reported back to the LMS. Instructors use a special interactive menu for managing exercises associated with a course.

G. Data Science Blocks

BlockPy focuses on Data Science as its primary context, and so we have blocks for working with data. We support a subset of the popular MatPlotLib library, and extend it with connections to the CORGIS project [13]. The MatPlotLib API, for example, provides a “plot(list)” function to create simple line plots. By mimicking MatPlotLib, students can seamlessly shift to a serious Python programming environment without loss of code.

The CORGIS (Collection of Real-time, Giant, Interesting, Situated datasets) Project makes motivating datasets available to introductory students through simple programming libraries [13]. These datasets are drawn from many disciplines, resulting in material meant to be universally interesting and relevant. Currently, BlockPy supports a number of different CORGIS libraries including weather data, earthquake data, United States crime statistics, and a classic book dataset.

H. Mutual Language Translation

A technical contribution of this project is the mutual language translation between Blockly and Python. Blockly outputs valid Python source code, which can be passed into Skulpt to extract a JSON representation of the Abstract Syntax Tree. This AST is parsed using our own Py2Block library to generate an XML representation that Blockly can render in the Block View. Figure 2 demonstrates the users’ experience. When a student tries to convert code with disconnected blocks, the generated Python code will be filled in with triple underscores. These underscores (a valid but undefined variable name in Python) will trigger a run-time error.

Figure 2. User perspective of block/text transition
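The text-to-blocks direction described in this section can be sketched in plain Python, using the standard ast module in place of Skulpt's JSON AST. This is not the Py2Block implementation: the block type names ("variable_set", "constant", "variable_get", "raw_text") and field names are invented for illustration, and only simple assignments are handled. It assumes Python 3.9 or later for ast.unparse.

```python
import ast
import xml.etree.ElementTree as ET


def expr_to_block(node):
    """Convert a small subset of Python expressions to block elements."""
    if isinstance(node, ast.Constant):
        block = ET.Element("block", type="constant")
        ET.SubElement(block, "field", name="VALUE").text = repr(node.value)
    elif isinstance(node, ast.Name):
        block = ET.Element("block", type="variable_get")
        ET.SubElement(block, "field", name="VAR").text = node.id
    else:
        # Unsupported syntax falls back to a block holding raw text.
        block = ET.Element("block", type="raw_text")
        ET.SubElement(block, "field", name="TEXT").text = ast.unparse(node)
    return block


def python_to_xml(source):
    """Translate top-level statements into a hypothetical block XML string."""
    root = ET.Element("xml")
    for stmt in ast.parse(source).body:
        simple = (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                  and isinstance(stmt.targets[0], ast.Name))
        if simple:
            block = ET.SubElement(root, "block", type="variable_set")
            ET.SubElement(block, "field", name="VAR").text = stmt.targets[0].id
            value = ET.SubElement(block, "value", name="VALUE")
            value.append(expr_to_block(stmt.value))
        else:
            raw = ET.SubElement(root, "block", type="raw_text")
            ET.SubElement(raw, "field", name="TEXT").text = ast.unparse(stmt)
    return ET.tostring(root, encoding="unicode")


print(python_to_xml("x = 5\nprint(x)"))
```

The raw-text fallback keeps the mapping total: any statement the translator does not understand still appears in the block view, as text.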

Blockly already supports compilation of its blocks to Python, JavaScript, PHP, and Dart. However, this multiple language support causes reduced isomorphism: each language has different syntax for its common operations, and it is impossible to create a fully-featured block language with a one-to-one mapping to them all. For example, JavaScript has no support for parallel assignment, a commonly-used feature in Python, while Python does not have a unary increment operator. Blockly itself has syntax and vocabulary descended from Logo.

Instead of trying to satisfy multiple languages, we have dropped support for other languages for a more fully-featured mapping to Python. This requires minor changes that introduce Python-centric syntax details: function blocks are labeled “define”, assignment blocks have an “=” symbol, the “add item to list” block is renamed to “append”. Blockly has also been extended with new language features, including dictionary access and creation.

Eventually, the interface should offer a complete isomorphic mapping to Python. However, there are a number of complications to resolve first. For instance, Python uses square brackets for both list indexing and dictionary access. There is a strong desire to differentiate between these types of access, visible in the block view as two distinct kinds of blocks (“get ith element of list” vs. “get key from dict” blocks). However, it is computationally difficult to statically identify the usage of a given pair of brackets; sophisticated program analysis techniques are needed.
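The ambiguity is easy to reproduce in a few lines of ordinary Python. In the sketch below (names and values chosen only for illustration), the expression container[key] is identical in both calls, and only the run-time type of container determines whether it is a list index or a dictionary lookup.

```python
def lookup(container, key):
    # The same subscript syntax serves lists (by position) and dicts (by key).
    return container[key]


temperatures = [61, 58, 64]
capitals = {0: "Olympia", "WA": "Olympia"}

print(lookup(temperatures, 0))  # list indexing
print(lookup(capitals, 0))      # dictionary access, same syntax
```

A translator looking only at the body of lookup cannot tell which of the two blocks to render without inferring the type of its argument.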

I. Parsons’ Problems

Parsons’ Problems are a special type of coding exercise where all of the necessary code blocks are present, but disconnected and shuffled. These kinds of problems scaffold beginners by providing everything they need to complete the problem, reducing many of the barriers to getting started. BlockPy supports these types of problems with a special “Parsons Mode” where top-level blocks are shuffled in the block mode.
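The idea of shuffling only top-level blocks can be sketched in text form with Python's ast module: each top-level statement, together with any nested body, is treated as one movable unit. The function name and the use of a seed are assumptions for this example, not BlockPy's implementation, and ast.unparse requires Python 3.9 or later.

```python
import ast
import random


def parsons_shuffle(solution, seed=None):
    """Return the top-level statements of `solution` in shuffled order."""
    # Each top-level statement keeps its nested body intact.
    statements = [ast.unparse(s) for s in ast.parse(solution).body]
    random.Random(seed).shuffle(statements)
    return statements


solution = """
total = 0
for value in [1, 2, 3]:
    total = total + value
print(total)
"""
for piece in parsons_shuffle(solution, seed=1):
    print(piece)
    print("---")
```

Keeping the loop and its body together means the learner rearranges three pieces here, not four.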

J. State Explorer

BlockPy provides a State Explorer, used to trace programs’ execution over time. The State Explorer displays more than just information about variables: users can step through the code’s execution and see, at each step, what has been printed or plotted, which modules are imported, and the values and types of variables.
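A rough analogue of such a trace can be built in standard CPython with sys.settrace. BlockPy itself runs on Skulpt in the browser, so this is only a sketch of the concept; the function name, the "<student>" filename label, and the snapshot format are assumptions made here.

```python
import sys


def trace_program(source):
    """Run `source` and record (line number, variables) before each line runs."""
    steps = []

    def tracer(frame, event, arg):
        if frame.f_code.co_filename != "<student>":
            return None  # ignore code that is not the student's program
        if event == "line":
            snapshot = {
                name: (type(value).__name__, repr(value))
                for name, value in frame.f_locals.items()
                if not name.startswith("__")
            }
            steps.append((frame.f_lineno, snapshot))
        return tracer

    code = compile(source, "<student>", "exec")
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(None)  # always restore normal execution
    return steps


for line, variables in trace_program("x = 1\ny = x + 1\nz = [x, y]\n"):
    print(line, variables)
```

Each recorded step pairs a line number with the name, type, and value of every variable visible just before that line executes, which is the information a step-through view needs.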

II. MODEL USE CASES

In this section, we consider some example scenarios that describe our vision of typical BlockPy use cases. Our intent is for BlockPy to be useful in both formal and informal situations.

A. Independent Learner

A learner independently logs into the BlockPy system and selects an introductory problem on calculating averages using iteration: “Is the weather in Seattle above 60 degrees Fahrenheit? Print Yes or No.” As a complete novice, they are unsure what to do after reading the problem description. If they decide to cheat by checking the current weather in Seattle and printing the literal value, the system notices that they are missing a relevant weather block, and explains that they need to combine programmatic decision logic with the appropriate data source. They think to access the “Weather” block category, and grab the weather get block, but are unsure what to do next. When they run their program, the system notices that they have not used any if statements, and suggests reading a linked chapter in an online textbook. If they continue to struggle with integrating pieces, the system can provide increasingly detailed hints until they succeed.
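The two checks in this scenario could be expressed as instructor logic along the following lines. This is a sketch using Python's ast module, not BlockPy's feedback API; the weather.get_temperature call, the function name, and the hint wording are all hypothetical.

```python
import ast


def weather_feedback(student_code):
    """Return a hint string, or None when both checks pass."""
    nodes = list(ast.walk(ast.parse(student_code)))
    # Any attribute access on a name called `weather` counts as using the data.
    uses_weather = any(
        isinstance(n, ast.Attribute)
        and isinstance(n.value, ast.Name)
        and n.value.id == "weather"
        for n in nodes
    )
    uses_if = any(isinstance(n, ast.If) for n in nodes)
    if not uses_weather:
        return "Use a block from the Weather category to get the temperature."
    if not uses_if:
        return "You need an if statement to decide what to print."
    return None


attempt = 'import weather\nt = weather.get_temperature("Seattle")\nprint(t)'
print(weather_feedback('print("Yes")'))
print(weather_feedback(attempt))
```

Ordering the checks from most to least fundamental gives the learner one hint at a time, matching the progression in the scenario.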

B. Classroom Lesson

Another common use case for the system would be an instructor with a large classroom of students. The instructor is using Canvas, an LTI-capable LMS. They create a series of assignments for the day’s classwork. Students log into Canvas and begin working on the assignments. As they complete the assignments, their grade is reported to Canvas. The instructor can monitor progress for the class and check which students are struggling to complete assignments. This information can allow them to target under-performers with earlier interventions. The more automatic feedback that instructors make available, the less they need to focus on simple problems (“You were checking the temperature for the wrong city.”) and the more they can focus on students that are truly struggling (“What is iteration?”).

C. 1-1 Tutoring

On several occasions, we have found BlockPy to be a useful tool for correcting individual students’ misconceptions. In particular, the block representation of programs can help beginners grasp that code is not a series of symbols but a structured representation of an algorithm. Consider a student struggling to write the necessary syntax for indexing a nested dictionary (e.g., a crime report broken into multiple levels, with the burglary rate for a city nested under a violent crime categorization). The student may not have a clear image of how the layered structure of data can translate into a chain of dictionary accesses. Sitting with the student, the instructor could build up an expression accessing the data by connecting together dictionary access blocks to the data block, showing the generated Python code at each step. The learner can visually see how chunks of the code correlate to blocks.
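In text form, the step-by-step build-up from this tutoring scenario might look like the following. The dictionary layout and the numbers are invented for illustration and do not reflect the actual CORGIS crime dataset.

```python
# Made-up data, for illustration only.
crime_report = {
    "city": "Blacksburg",
    "violent": {"burglary": {"rate": 12.5}},
}

violent = crime_report["violent"]   # first level
burglary = violent["burglary"]      # second level
rate = burglary["rate"]             # third level

# The same three steps written as one chained expression.
chained = crime_report["violent"]["burglary"]["rate"]
print(rate == chained)
```

Each intermediate variable corresponds to one dictionary access block, and the chained expression corresponds to the blocks snapped together.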

III. PILOT STUDY

BlockPy was piloted in an introductory Computational Thinking course with 35 students in Spring 2015. These students came from a diverse range of majors, including liberal arts (57%), architecture (17%), and sciences (15%). There were 20 female students (57%) and 15 male. The vast majority of students reported no prior experience in programming, with less than 17% having taken the high school AP course. Students were evenly distributed across years, with slightly more seniors (29%), equal percentages of sophomores and juniors (26%), and fewer freshmen (14%).

The course content focused on teaching Abstraction and Algorithms. While programming was not a primary learning objective, it was an important topic in the course for concretely talking about higher level objectives. In the first third of the course, students worked with NetLogo (although they did not program in it, they did read code) and participated in explanatory kinesthetic activities. Then students were introduced to Python using BlockPy, where they spent roughly six classes on completing guided practice problems. The next two classes were devoted to using a regular Python environment (Spyder) to complete small programming assignments (similar to the ones done with BlockPy). Finally, students were given eight class periods to work on an individual final project in Spyder.

Figure 3. Responses from Survey on BlockPy

A. Methodology

Student responses to BlockPy were collected through two surveys, one given after the BlockPy section and the other given at the end of the course. The surveys were composed of 4-point Likert questions and open-ended qualitative questions. A selection of particularly interesting results is shown in Figure 3. All conclusions from this study should be considered preliminary, since it was conducted with the first version of the BlockPy environment.

B. Perceptions of BlockPy

The first survey question asked whether students wanted more time with each of the programming environments they used in the course: NetLogo, BlockPy, or Spyder. Note that BlockPy was referred to as “Blockly”, and the Spyder environment was referred to as “Python”. These results suggest that students valued their experiences with BlockPy more than NetLogo, but mostly felt that they were not getting enough Python experience. This is backed up by the qualitative data, where some students say “More Blockly, Less Python”, but others ask for “More Blockly and More Python”.

C. Usage of BlockPy

Over the six days spent using BlockPy, students were tasked with 40 classwork questions and 19 homework questions. Students ran their code an average of 4 times per problem (standard deviation 1.8).

Students were asked if they felt successful in the transition from BlockPy to Spyder. Only 65% of the class agreed or strongly agreed, suggesting that there was a sizable population that felt uncomfortable during that transition. The original design of the mutual language translation showed the block and text views simultaneously, side by side. However, analysis of the logs reveals that most students did not take advantage of the feature. Only 5 students (roughly 15%) had used the conversion functionality at all, and fewer used it consistently. It is possible that students were observing the code as it changed, but they were not writing textual code. It is difficult to say why exactly students did not take advantage of it. Our current hypothesis is that students were confused by the interface, which required manual conversion to go from text to blocks. In our new version, the conversion happens automatically, simply by switching tabs, and we provide intentional opportunities for the students to switch. Preliminary data suggests this new interface greatly improves students’ transition.

Students were surveyed about what helped their learning the most. Peer learning and instructors were about on par with the automatic feedback given in BlockPy, suggesting the strong value of the system. Despite the popular response to the State Explorer, relatively few students took advantage of it (11 students, roughly 31%). Since more than 50% of the class reported finding value in the State Explorer, it is possible that the students benefited from instructor presentations of the tool, even if they didn’t take advantage of it themselves.

D. Data Science Context

Students were surveyed about their perceptions of the value of different course experiences with regard to their long-term career goals and their interest in potential contexts for introductory computing courses. Each of these contexts was briefly described; for example, the Media Comp context was listed as “working with pictures, sounds and movies.” Both sets of results suggest that students find data science to be compelling, but this should be taken with a grain of salt, since students have negligible experience with alternative contexts. However, our preliminary results suggest that this is an approach worth exploring further.


IV. FUTURE WORK

BlockPy is an evolving project. We have a number of features planned to expand Python support. We also plan to expand the guided feedback API for instructors, for example by leveraging more static and dynamic type inference techniques to improve block rendering and error reporting.
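As a rough illustration of what static type inference could look like for this purpose, the sketch below uses Python's standard `ast` module to guess the types of variables assigned from literals. The function name `infer_literal_types` and its rule set are assumptions made for this example; they do not describe BlockPy's actual implementation.

```python
import ast


def infer_literal_types(source):
    """Map variable names to type names inferred from literal assignments.

    Hypothetical helper for illustration only: it handles just
    `name = <literal>` statements at the top level of the program.
    """
    inferred = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if isinstance(value, ast.Constant):
            type_name = type(value.value).__name__
        elif isinstance(value, ast.List):
            type_name = "list"
        elif isinstance(value, ast.Dict):
            type_name = "dict"
        else:
            # Calls, names, and compound expressions are not inferred here.
            type_name = "unknown"
        for target in node.targets:
            if isinstance(target, ast.Name):
                inferred[target.id] = type_name
    return inferred


program = "count = 0\nname = 'Ada'\ntemps = [71.5, 68.0]\ntotal = count + 1\n"
print(infer_literal_types(program))
```

A fuller analysis would have to propagate types through expressions and function calls; this sketch stops at literals to keep the idea visible.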

We also have research questions posed by the block-based nature of the interface. One of the biggest values of a block-based environment is that it can immediately expose the breadth of a rich API. This greatly reduces students’ dependency on documentation. Of course, exposing this breadth can also be a downside, as students might be overwhelmed by the features in the interface. At what rate to expose language features remains an open research question.

One of the major advantages of game and animation design as an introductory context is that it makes abstract concepts concrete. Further analysis is needed to determine the trade-offs of using different contexts. BlockPy can aid this analysis by supporting alternative contexts, such as turtle graphics and media computation libraries.

It is difficult to derive conclusive results from our pilot due to the small population size and the evolving nature of BlockPy. Preliminary results from further in-progress studies suggest that recent improvements have overcome a number of limitations of the environment and that user feedback has dramatically improved. We are conducting follow-up studies on the logged students’ code, even as we collect more data on the newest iteration. We are hopeful that BlockPy will increase its user base, providing a larger sample of learners for research and more meaningful data.

A. Missing Language Features

BlockPy is being developed in an on-demand fashion, driven by immediate course needs, but is still limited. For example, the block interface does not support a number of advanced Python features, such as an interface for writing object-oriented classes. This does not mean that students cannot write programs featuring classes or other advanced features. Python code using these features will render in BlockPy as embedded text blocks and will execute through Skulpt normally. There is no technical impediment to supporting these features; the process is limited only by time and community interest.
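The fallback behavior described above can be sketched with Python's standard `ast` module: statements of a supported kind become structured blocks, while anything else is kept as a raw text block. The `SUPPORTED` tuple, the `to_blocks` function, and the dictionary layout are hypothetical choices for this example and are not taken from BlockPy's code.

```python
import ast

# Assumed set of statement kinds that have a dedicated block (illustrative only).
SUPPORTED = (ast.Assign, ast.Expr, ast.For, ast.If, ast.FunctionDef)


def to_blocks(source):
    """Split a program into top-level blocks, keeping unsupported code as text."""
    blocks = []
    for node in ast.parse(source).body:
        # Preserve the exact source text so the program can still be executed.
        text = ast.get_source_segment(source, node)
        kind = type(node).__name__ if isinstance(node, SUPPORTED) else "Text"
        blocks.append({"kind": kind, "source": text})
    return blocks


program = (
    "class Dog:\n"
    "    def speak(self):\n"
    "        return 'Woof'\n"
    "\n"
    "pet = Dog()\n"
    "print(pet.speak())\n"
)
for block in to_blocks(program):
    print(block["kind"], "->", block["source"].splitlines()[0])
```

Because each block retains its original source, a class definition that has no dedicated block can still be handed to the interpreter unchanged, which mirrors the behavior the text attributes to embedded text blocks.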

V. CONCLUSION

In this paper, we have introduced BlockPy, our block-based environment for Python. It is open source and freely available at https://www.blockpy.com/. We believe that BlockPy represents a new paradigm for introductory learners, blending interactive support with a strong path to programming maturity. By teaching in the context of data science, we provide authenticity even as we move students out of the system towards a more serious environment. Research with BlockPy will help answer crucial questions about the value of data science and blocks. Our hope is that BlockPy’s open nature can encourage learners from diverse fields to engage with computing in a way that will lead to a computing-rich future for a larger population.

ACKNOWLEDGMENTS

We gratefully acknowledge the support of the National Science Foundation under Grants DGE-0822220, DUE-1444094, and DUE-1624320.

REFERENCES

[1] B. D. Jones, “Motivating students to engage in learning: The MUSIC model of academic motivation,” International Journal of Teaching and Learning in Higher Education, vol. 21, no. 2, pp. 272–285, 2009.

[2] M. Guzdial and A. E. Tew, “Imagineering inauthentic legitimate peripheral participation: an instructional design approach for motivating computing education,” in Proceedings of the second international workshop on Computing education research. ACM, 2006, pp. 51–58.

[3] A. C. Bart, R. Whitcomb, E. Tilevich, C. A. Shaffer, and D. Kafura, “Computing with CORGIS: Diverse, real-world datasets for introductory computing,” in Proceedings of the 48th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’17, 2017.

[4] P. Guo, “Python is now the most popular introductory teaching language at top U.S. universities,” BLOG@CACM, July 2014.

[5] A. Ko, “Programming languages are the least usable, but most powerful human-computer interfaces ever invented,” http://blogs.uw.edu/ajko/2014/03/25/programming-languages, 2014.

[6] T. W. Price and T. Barnes, “Comparing textual and block interfaces in a novice programming environment,” in Proceedings of the Eleventh Annual International Conference on International Computing Education Research, ser. ICER ’15, 2015, pp. 91–99.

[7] D. Weintrop and U. Wilensky, “Using commutative assessments to compare conceptual understanding in blocks-based and text-based programs,” in Proceedings of the Eleventh Annual International Conference on International Computing Education Research, ser. ICER ’15, 2015, pp. 101–110.

[8] T. Tang, S. Rixner, and J. Warren, “An environment for learning interactive programming,” in Proceedings of the 45th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’14, 2014, pp. 671–676.

[9] S. H. Edwards, D. S. Tilden, and A. Allevato, “Pythy: Improving the introductory Python programming experience,” in Proceedings of the 45th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’14, 2014, pp. 641–646.

[10] P. J. Guo, “Online Python Tutor: Embeddable web-based program visualization for CS education,” in Proceeding of the 44th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’13, 2013, pp. 579–584.

[11] D. Bau, M. Dawson, and A. Bau, “Using Pencil Code to bridge the gap between visual and text-based coding (abstract only),” in Proceedings of the 46th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’15, 2015, pp. 706–706.

[12] Y. Matsuzawa, T. Ohata, M. Sugiura, and S. Sakai, “Language migration in non-CS introductory programming through mutual language translation environment,” in Proceedings of the 46th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’15, 2015, pp. 185–190.

[13] A. C. Bart, “Situating computational thinking with big data: Pedagogy and technology (abstract only),” in Proceedings of the 46th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’15, 2015, pp. 719–719.