Computing meets Language
Kevin Duh
Dept. of Computer Science & Human Language Technology COE, Johns Hopkins University
What does a Computer Scientist do?
Computer Science is more than just programming & computers!
Image source: Almonroth, CC BY-SA, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Typing_computer_screen_reflection.jpg
Computational Thinking
Thinking like a computer scientist means more than being able to program a computer.
It requires thinking at multiple levels of abstraction.
Jeannette Wing (Columbia University), Communications of the ACM, 2006, https://www.cs.cmu.edu/~15110-s13/Wing06-ct.pdf
Image source: World Economic Forum, CC BY-SA, via Wikimedia Commons, https://en.wikipedia.org/wiki/File:Jeannette_Wing,_Davos_2013.jpg
Examples of computational thinking at work
• Build a “model”
  • Abstracts the key properties of what you’re studying
  • Allows you to run simulations and predictions
• Examples:
  • Computational Biology
  • Computational Finance
  • Computational Linguistics
Image sources: (1) Probkos13, CC BY-SA, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Punnett_Square.svg (2) Garwood, Sharma, Dunlop, Giribet, CC BY, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Phylogenetic_Analyses_of_Opiliones_2014-A.png
Modeling language?
• You say you “know English”
  • What exactly is it that you know?
  • How would you write it down? In what notation?
• How do toddlers learn their first language?
• Can we program a computer to understand human language?
  • Exploit large amounts of data & build probabilistic models of language (e.g. via machine learning)
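As a rough illustration of what a "probabilistic model of language" built from data can look like, here is a minimal bigram sketch in Python. The toy corpus and the function name `train_bigram_model` are assumptions made up for this example; they are not from the slides, and real systems use far more data and more sophisticated models.

```python
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration (not from the slides).
corpus = [
    "the world is connected",
    "research is important",
    "the world is big",
]

def train_bigram_model(sentences):
    """Estimate P(next_word | word) by counting adjacent word pairs."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of each sentence.
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            pair_counts[prev][nxt] += 1
    # Turn raw counts into relative frequencies.
    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        model[prev] = {word: count / total for word, count in counter.items()}
    return model

model = train_bigram_model(corpus)
print(model["the"])  # in this toy corpus, "the" is always followed by "world"
print(model["is"])   # "is" is followed by three different words, equally often
```

The model here is nothing more than counted word pairs turned into probabilities, which is the sense in which it is "learned from data" rather than written down by hand.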
Outline
1. Introduction: Computer Science → Computational Thinking
2. My field: Computational Linguistics
3. Example research topic: how Google Translate works
4. How to begin CS research as a high schooler
Computational Linguistics, a.k.a. Natural Language Processing
• We want to study:
  • How to model human language
  • How to program computers to interpret and process human language
• Interdisciplinary field → good if you like both STEM and humanities!
  • Computer Science & Engineering
  • Linguistics, Cognitive Science
  • Statistics, Machine Learning
This isn’t easy! Unlike programming languages, human language can be ambiguous.
Image source: http://walkinthewords.blogspot.com/2010/07/syntax-with-sherlock-sentence-ambiguity.html
Sherlock saw the man using binoculars
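One way to make the ambiguity concrete is to write down the two readings as different groupings of the same words. The sketch below uses plain nested Python tuples as a stand-in for parse trees; the groupings are hand-written and simplified for illustration, and the variable names are assumptions, not notation from the slides.

```python
sentence = "Sherlock saw the man using binoculars"

# Reading 1: "using binoculars" attaches to "saw"
# (Sherlock is the one holding the binoculars).
reading_verb_attach = (
    "Sherlock",
    (("saw", ("the", "man")), ("using", "binoculars")),
)

# Reading 2: "using binoculars" attaches to "the man"
# (the man is the one holding the binoculars).
reading_noun_attach = (
    "Sherlock",
    ("saw", (("the", "man"), ("using", "binoculars"))),
)

def leaves(tree):
    """Flatten a nested tuple back into its words, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree:
        words.extend(leaves(child))
    return words

# Both structures spell out exactly the same sentence...
print(leaves(reading_verb_attach) == sentence.split())
print(leaves(reading_noun_attach) == sentence.split())
# ...but they are different structures, hence different meanings.
print(reading_verb_attach != reading_noun_attach)
```

A program that only sees the flat string has no way to tell the two readings apart, which is part of why processing human language is harder than processing a programming language.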
What applications are possible?
• Currently we don’t yet have a model that really understands language fully, but we have some usable ones
Strong AI vs. Weak AI
When I look at an article in Russian, I say: “This is really written in English, but has been coded in some strange symbols.”
Warren Weaver, American scientist (1894-1978)
Image courtesy: Biographical Memoirs of National Academy of Science, vol. 57
1a) evas dlrow-eht
1b)
2a) dlrow-eht si detcennoc
2b)
3a) hcraeser si tnatropmi
3b)
4a) ew eb-ot-mia tseb ni dlrow-eht
4b)
Your mission: We found 4 sentence pairs from two ancient Martian languages. Figure out which “word” translates to which.
Frequency (from the slide figure): dlrow-eht: 3, 1; si: 2, 1
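The counting behind the frequency slide can be sketched in a few lines of Python. Only the "a" sentences survive in this transcript, so the sketch counts those alone; the helper names `word_frequencies` and `sentence_ids` are assumptions for this example. One plausible use, not confirmed by the slides, is to run the same counts on the "b" sentences and pair up words that occur in the same sentence pairs.

```python
from collections import Counter, defaultdict

# The four "a" sentences from the puzzle (the "b" side is not preserved here).
sentences_a = [
    "evas dlrow-eht",
    "dlrow-eht si detcennoc",
    "hcraeser si tnatropmi",
    "ew eb-ot-mia tseb ni dlrow-eht",
]

def word_frequencies(sentences):
    """Count how many times each word occurs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    return counts

def sentence_ids(sentences):
    """Map each word to the set of sentence numbers (1-based) it appears in."""
    ids = defaultdict(set)
    for number, sentence in enumerate(sentences, start=1):
        for word in sentence.split():
            ids[word].add(number)
    return dict(ids)

freq_a = word_frequencies(sentences_a)
where_a = sentence_ids(sentences_a)

print(freq_a["dlrow-eht"], sorted(where_a["dlrow-eht"]))  # 3 [1, 2, 4]
print(freq_a["si"], sorted(where_a["si"]))                # 2 [2, 3]
```

A word in the other language that shows up in exactly sentences 1, 2 and 4 would be a strong candidate translation for dlrow-eht, with no dictionary required.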
A day in the life of a researcher
1. Think up a new model for language translation
2. Program it
3. Feed the model lots of data
4. Test it. Read other researchers’ papers to get more ideas.
5. Go back to (1) until satisfied, then publish
Practical suggestions for gaining Computer Science (CS) research experience
• Reality:
  1. CS is not just about programming, but strong programming skills are a must!
  2. There are many research areas related to CS: enough to fit anyone’s interest, but also so many that you might not know what is out there
• Suggested plan:
  1. Improve your programming skills
  2. Contact professors for internship opportunities
Improving your programming skills
• Pick one programming language and become really good at it
  • e.g. Java, Python, C++, JavaScript
• How to be good?
  • Program a lot.
  • Read other people’s code. Work with a friend, or join others’ GitHub projects
  • Learn about data structures & algorithms. Take Computer Science classes (at school or Coursera, etc.)
• Create a portfolio on GitHub that you can show during applications
Contacting professors for opportunities
• Write a polite email
  • Be specific about what you are looking for
  • Add a link to your GitHub repo and explain your interest & experience
• Don’t expect a reply
  • Professors get so many emails like this every day from around the world.
  • Professors have changing commitments. A no-go this year doesn’t mean there is no chance next year.
• If you’re lucky and get a project:
  • Be proactive in figuring out how you can contribute.
  • Be comfortable working on something when you don’t know all the details.
  • Be independent. Learn when to ask questions and when to self-study.
Additional comments
• Structured internship programs are also good ways to learn, e.g.
  • Johns Hopkins Applied Physics Lab (APL) ASPIRE program
  • More resources: https://cty.jhu.edu/resources/academic-opportunities/internships/math.html
• If interested in Machine Learning & AI subareas of CS, then math and probability/statistics are also important.
Summary
1. Introduction: Computer Science → Computational Thinking
2. My field: Computational Linguistics
3. Example research topic: how Google Translate works
4. How to begin CS research as a high schooler