Dynamic Benchmarking: Software development through competition. Alex Dubreuil, Northeastern University...
![Page 1: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/1.jpg)
Dynamic Benchmarking
Software development
through competition
Alex Dubreuil
Northeastern University
![Page 2: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/2.jpg)
This slide intentionally left blank
![Page 3: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/3.jpg)
Contents
• Dynamic Benchmarking Introduction
• Uses of the Benchmarking Game model
• Software Development (CS 4500)
• A Lesson I’ve learned
Caution: Slide layout may cause drowsiness.
![Page 4: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/4.jpg)
Benchmarking
• Assesses relative performance
• Typically by running standardized tests
  – Produces scores which are then compared
  – SATs
• Other options exist
  – Allowing software to compete directly
  – Chess game
![Page 5: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/5.jpg)
The Traditional Approach
[Diagram: Developers A, B, and C each submit their software (Software A, B, C) to a single Static Benchmark, which produces Score A, Score B, and Score C. The benchmark is parameterized by the domain.]
![Page 6: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/6.jpg)
The Dynamic Approach
[Diagram: Teams A, B, and C each supply both software and a benchmark (Software A and Benchmark A, and so on), packaged as an Agent. The three Agents compete in an Artificial World (Game), parameterized by the domain, which produces an Agent Ranking.]
![Page 7: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/7.jpg)
An Artificial World: Agent’s View
[Diagram: Administrator and Agent. Administrator to Agent: opponents’ communication, feedback. Agent to Administrator: beliefs, challenges, problems, solutions. Administrator output: results.]
• Problems: Benchmark output
• Solutions: Software output
• Beliefs/Challenges: statements about algorithms
![Page 8: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/8.jpg)
Problems & Solutions
• Problem communication:
  – Define an instance of a problem in the domain
• Solution communication:
  – Respond to an opponent’s problem
  – Administrator has a metric for determining how good a solution is
  – This metric is well defined and known by all
![Page 9: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/9.jpg)
Beliefs & Challenges
• General statements about algorithms
  – Belief:
    • Defines a subset of the problems in the domain
    • Makes a statement about the problems in that subset
  – Challenge:
    • A response to a belief of an opponent
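The belief and challenge structure above can be sketched as plain data. This is only an illustration under stated assumptions: the toy domain (problems are integers), the form of the statement (a minimum solution quality over the subset), the rule for who wins, and every name below (`Belief`, `Challenge`, `resolve`, `owner_quality`) are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Belief:
    """A claim about a subset of the domain (hypothetical shape)."""
    owner: str
    in_subset: Callable[[int], bool]  # defines the subset of problems
    min_quality: float                # the statement: quality is at least this


@dataclass(frozen=True)
class Challenge:
    """An opponent's response to a belief: a concrete problem to test it."""
    challenger: str
    belief: Belief
    problem: int


def resolve(challenge: Challenge, solve_quality: Callable[[int], float]) -> str:
    """Return the name of the agent that wins the challenge.

    Assumed rule: the belief's owner must reach the claimed quality on any
    problem that lies inside the belief's subset.
    """
    belief = challenge.belief
    if not belief.in_subset(challenge.problem):
        # A problem outside the subset says nothing about the belief.
        return belief.owner
    achieved = solve_quality(challenge.problem)
    return belief.owner if achieved >= belief.min_quality else challenge.challenger


def owner_quality(problem: int) -> float:
    """Toy solver for the belief's owner: good only on multiples of 4."""
    return 1.0 if problem % 4 == 0 else 0.25


# Toy belief: "every even problem can be solved with quality at least 0.5".
belief = Belief(owner="A", in_subset=lambda p: p % 2 == 0, min_quality=0.5)

print(resolve(Challenge("B", belief, 8), owner_quality))  # owner holds the belief
print(resolve(Challenge("B", belief, 6), owner_quality))  # challenger refutes it
```

Keeping the subset as a predicate rather than an explicit list means a challenger can probe any problem in the domain, and the check for membership stays in one place.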
![Page 10: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/10.jpg)
Administrator
• Opponents’ communication
  – Filter all communication through the Administrator for security
  – Filter information when necessary
• Feedback:
  – Inform agents of rule violations
  – Inform agents of status changes
![Page 11: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/11.jpg)
Administrator
• Results
  – Track state changes through the game
  – Produce the agent ranking from the end game state
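A rough sketch of the results role: record each state change as it happens, then derive the ranking from the final state. The slides do not say what the game state contains, so the single numeric score per agent, the alphabetical tie-break, and the `Scoreboard` name are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


class Scoreboard:
    """Hypothetical Administrator component that tracks per-agent scores."""

    def __init__(self, agents: List[str], start: float = 0.0) -> None:
        self.scores: Dict[str, float] = {agent: start for agent in agents}
        # Every state change is kept so the end state can be audited.
        self.history: List[Tuple[str, float, str]] = []

    def apply(self, agent: str, delta: float, reason: str) -> None:
        """Record one state change for a registered agent."""
        if agent not in self.scores:
            raise KeyError(f"unknown agent: {agent!r}")
        self.scores[agent] += delta
        self.history.append((agent, delta, reason))

    def ranking(self) -> List[str]:
        """Order agents by score, highest first; ties broken by name."""
        return sorted(self.scores, key=lambda agent: (-self.scores[agent], agent))


board = Scoreboard(["A", "B", "C"])
board.apply("B", 2.0, "solved an opponent's problem")
board.apply("A", -1.0, "rule violation")
board.apply("C", 2.0, "refuted a belief")
print(board.ranking())
```

Keeping the history alongside the totals is a design choice: it lets the Administrator explain a ranking to agents through the feedback channel, not just announce it.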
![Page 12: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/12.jpg)
What’s next
• Dynamic Benchmarking Introduction
• Uses of the Benchmarking Game model
• Software Development (CS 4500)
• A Lesson I’ve learned
If you can read this, you don’t need glasses.
![Page 13: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/13.jpg)
Overhead
• Requires mature Administrator, communication system for accurate results
  – Reuse between domains is possible
• Requires new translation for each problem domain
![Page 14: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/14.jpg)
Software Development
• Ranks software without a mature benchmark
  – Dynamic approach excels when a well-defined benchmark does not exist
• Creates data to build better benchmarks
  – Because Agents, not Software, are ranked
• Forces developers to consider both their solutions and the problem domain
![Page 15: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/15.jpg)
Education
• Motivates students
• Mature Administrator/Agent not required
• Creates interesting student interaction
• Creates a realistic software development environment
![Page 16: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/16.jpg)
What’s next
• Dynamic Benchmarking Introduction
• Uses of the Benchmarking Game model
• Software Development (CS 4500)
• A Lesson I’ve learned
Yeah, I got nothing.
![Page 17: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/17.jpg)
Specker Challenge Game
• The SCG is the basis for Professor Karl Lieberherr’s Software Development class
• Uses an arity-3 Boolean constraint satisfaction problem (CSP) as our domain
• Teams of 2 to 3 produce the components of an Agent
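To make the domain concrete, here is a minimal sketch of an arity-3 Boolean CSP: each constraint is a relation over exactly three Boolean variables. The representation, the two example relations, the brute-force search, and the quality measure (fraction of satisfied constraints) are assumptions chosen for illustration; the slides do not specify the SCG's actual problem format or metric.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: three variable names plus a relation on them.
Relation = Callable[[bool, bool, bool], bool]
Constraint = Tuple[Tuple[str, str, str], Relation]


def one_in_three(a: bool, b: bool, c: bool) -> bool:
    """Satisfied when exactly one of the three variables is true."""
    return (a + b + c) == 1


def at_least_one(a: bool, b: bool, c: bool) -> bool:
    """Satisfied when any of the three variables is true."""
    return a or b or c


def satisfied_fraction(constraints: List[Constraint],
                       assignment: Dict[str, bool]) -> float:
    """Assumed quality measure: share of constraints the assignment satisfies."""
    if not constraints:
        return 1.0
    satisfied = sum(
        1 for names, relation in constraints
        if relation(*(assignment[name] for name in names))
    )
    return satisfied / len(constraints)


def best_assignment(variables: List[str],
                    constraints: List[Constraint]) -> Tuple[Dict[str, bool], float]:
    """Brute-force search over all assignments; only viable for tiny instances."""
    best: Dict[str, bool] = {}
    best_score = -1.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        score = satisfied_fraction(constraints, assignment)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


variables = ["x", "y", "z", "w"]
constraints: List[Constraint] = [
    (("x", "y", "z"), one_in_three),
    (("y", "z", "w"), one_in_three),
    (("x", "y", "w"), at_least_one),
]
solution, score = best_assignment(variables, constraints)
print(solution, score)
```

In the deck's vocabulary, the constraint list plays the role of a problem (benchmark output) and the assignment plays the role of a solution (software output).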
![Page 18: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/18.jpg)
(Some of the) Skills Involved
• Using outsourced tools
  – DemeterF (developed by Bryan Chadwick)
  – Component Market
• Dealing with users
  – Underspecified requirements
• Source control
• Constraint Satisfaction algorithms
• Data mining
![Page 19: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/19.jpg)
Added bonus
[Diagram relating Programmers, Domain Knowledge Experts, Customers, Users, and Salespeople. Labels on the diagram: Requirements, Limitations, How-to, So what?, Code, Gibberish, Non-technical Requirements.]
![Page 20: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/20.jpg)
It’s a busy class
• Traditional grading would not work
• The competition keeps students motivated
![Page 21: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/21.jpg)
What’s next
• Dynamic Benchmarking Introduction
• Uses of the Benchmarking Game model
• Software Development (CS 4500)
• A Lesson I’ve learned
![Page 22: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/22.jpg)
Administrator Security
• Never accept extra input
  – Transaction: Challenge: ID, Type, Price
  – vs.
  – Transaction: Challenge: ID
• Check all necessary input
  – Transaction: Deliver Problem: ID, Problem
  – Check: Does the Problem match the Type?
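Both rules can be sketched as a whitelist check followed by a consistency check. The reading assumed here is that a Challenge should carry only an ID because the Administrator already holds the Type on record. The field names, the dictionary layout, and the way a problem's type is derived from its content are all hypothetical, not the SCG's actual transaction format.

```python
from typing import Any, Dict, FrozenSet, List

# Hypothetical whitelist: the only fields each transaction kind may carry.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "challenge": frozenset({"id"}),
    "deliver_problem": frozenset({"id", "problem"}),
}


class RejectedTransaction(ValueError):
    """Raised when the Administrator refuses a transaction."""


def problem_type(problem: List[str]) -> FrozenSet[str]:
    """Derive the type from the problem's content (here: relation names used)."""
    return frozenset(problem)


def validate_transaction(kind: str,
                         payload: Dict[str, Any],
                         recorded_types: Dict[int, FrozenSet[str]]) -> Dict[str, Any]:
    """Accept a transaction only if it has exactly the expected, valid fields."""
    allowed = ALLOWED_FIELDS.get(kind)
    if allowed is None:
        raise RejectedTransaction(f"unknown transaction kind: {kind!r}")

    # Never accept extra input: the field set must match the whitelist exactly.
    fields = set(payload)
    if fields != allowed:
        raise RejectedTransaction(
            f"extra={sorted(fields - allowed)}, missing={sorted(allowed - fields)}"
        )

    # Check all necessary input: the ID must refer to a known challenge.
    challenge_id = payload["id"]
    if not isinstance(challenge_id, int) or challenge_id not in recorded_types:
        raise RejectedTransaction("unknown challenge id")

    if kind == "deliver_problem":
        problem = payload["problem"]
        if not isinstance(problem, list) or not all(isinstance(r, str) for r in problem):
            raise RejectedTransaction("malformed problem")
        # Does the Problem match the Type the Administrator has on record?
        if problem_type(problem) != recorded_types[challenge_id]:
            raise RejectedTransaction("problem does not match the recorded type")

    return dict(payload)


# The Administrator's own record of each challenge's type, keyed by ID.
recorded = {7: frozenset({"one_in_three"})}
print(validate_transaction("challenge", {"id": 7}, recorded))
print(validate_transaction(
    "deliver_problem", {"id": 7, "problem": ["one_in_three", "one_in_three"]}, recorded
))
```

The type is looked up from the Administrator's record and recomputed from the delivered problem, so nothing the agent claims about its own submission is taken on trust.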
![Page 23: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/23.jpg)
General Lesson
• Never trust user input
  – Sanitize data
  – Protect against buffer overflows
![Page 24: Dynamic Benchmarking Software development though competition Alex Dubreuil Northeastern University dubreuil.a@husky.neu.edu acdubre@gmail.com.](https://reader035.fdocuments.us/reader035/viewer/2022070415/5697bf7a1a28abf838c83341/html5/thumbnails/24.jpg)
More General Lesson
• It’s good to see things before they can do you or others harm
  – Users you can yell at
  – Security flaws that don’t cost money
  – Underspecified requirements