ARTIFICIALLY INTELLIGENT VIRTUAL ASSISTANT P ... · PDF filefor identifying input and output...

ARTIFICIALLY INTELLIGENT VIRTUAL ASSISTANTP.DhivyaBharathi (Assistant Professor/CSE)1 D.J.Srihema2 R.Monisha3

V.Divya4 B.Veenitha5

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERINGEr.PerumalManimekalai college of Engineering, Hosur-635117.

[email protected]

Abstract :Systems today are gettingexpert day by day and intend to help humanin their day to day queries. Today AI ispresent in a variety of fields ranging fromindustries in manufacturing, to diagnosis inmedicine technology, to customer care inpublic relations. There exist lots of onlineArtificial Intelligence (AI) assistants thathelp people solve their problems. So, here inthe implemented system we built an AI thatwill solve college related query. It's like asmall scale college oriented intelligentsearch engine. The implemented system isbasically a Virtual Assistant that is strictlycollege oriented. The implemented systementertains the queries of a student regardingcollege related issues. The implementedsystem was constructed Using VisualBasic .Net.

Keywords - Virtual Assistant, Bot, Chat,College, Assistant, Academic, AIML

1. Introduction It is very easy to interact with system even ifuser doesn’t know how to read and write so,it becomes incentive for user to use thissystem by using natural language. Themodel receives response from user either inform of text or in form of speech. Then it

responds to the user in both text as well assound format. One of the most importantgoals is to enhance the interaction realnessof these systems.The concept of “Alice” wasthe first chatterbot which receives questionfrom user based on pattern recognition.Large amount of sentences were involved totrain the model.catchy interfaces usinghuman-like avatars are provided for thisreason which are capable to adapt theirbehavior according to theconversationcontent. This kind of agents interactvocallywith users by the use of AutomaticSpeechRecognition (ASR) and Text To Speech(TTS) systems. Their“emotions” changeaccording to the sentences entered by theuser.Facial movements are simulated in the3D talking headby rational free form.As thename of implemented system suggests, thevision was to develop a virtual assistant forcollege students to solve college basedqueries and provide college relatedinformation. The term ‘Virtual’ may bedefined as “Not physically existing as suchbut made by software to appear to do so.”Thus, the implemented system is a virtualassistant, i.e., it is a software that usesartificial intelligence to guide a student andtakes actions to effectively understandstudent queries and respond to themrationally.

The implemented system can assist

mailto:[email protected]

vts-6

Text Box

International Conference on Emerging trends in Engineering, Science and Sustainable Technology (ICETSST-2017)

vts-6

Text Box

E-ISSN: 2348 – 8387 www.internationaljournalssrg.org 00

vts-6

Text Box

the user by solving queries related tocollege.

2. EHeBby architectureThe system is composed of two majorfactors, as shown in figure 1, a reasonermoduleand a Talking Head (TH)module.The reasoner processes the userquestion by means of the A.L.I.C.E.(Artificial Linguistic Internet ComputerEntity)engine. ALICE was implementedwith pattern matching algorithm which is assimple as string matching technique.ALICEtakes the input in the text form and producesoutput in text form like question and answerbased system.Whereas another chatter botwhich was build earlier known as Elizabethrequires set of input rules, keyword patternsfor identifying input and output rules toproduce required response is not used in thissystem since it is complicated.in order toproduce the correct information needed forthe talking head animation the emotionalarea allows the chatbot to elaborateinformation related to the produced answerand a correspondent humor level.The THsystem relies on a web application where aservlet selects the basis facial meshes to be

animated and integrated to process emotioninformation.On the client side, all these dataare used to actually animate the head.twoanswers can be combined by ALICE in thecase of splitting during the normalizationprocess and by recursive process.Patternmatching process is the general feature ofALICE chatter bot, which is simple anddepth-first search algorithm is used where itresults in producing no output also.

Fig. 1. EHeBby Architecture

3. Proposed WorkThe architecture of bot system is dividedinto following modules:i) AIML

ii) Microsoft Speech recognition

iii) Corpus

iv) Brain

v) Bot engine

i). AIMLArtificial intelligence markup language is anextension of XML (extensible markuplanguage). The most important parts of anAIML documents are: a) <aiml>

b) <category>

c) <pattern>

d) <template>AIML consists of aiml objects and theseaiml objects consist of topics andcategorieswhich contain either parsed orunparsed data from their information isextracted.

3.1 AIML KBAIML categories:

1. To manage a general conversation

vts-6

Text Box

E-ISSN: 2348 – 8387 www.internationaljournalssrg.org Page 101

vts-6

Text Box


2. To generate humorous sentences bymeans of jokes.

3. To retrieve humorous or funnysentences through the comparisonbetween the user input

4. To recognize an humorous intent inthe user sentences.

3.2 Humour recognition areaThe humour recognition consists ofidentification, inside the user sentences, ofparticular humorous text features. We focuson three main humorous features:alliteration, antinomy and adult slang.

3.2.1 Alliteration-This technique isaimed at discovering possible repetitions ofthe beginning phonemes in subsequentwords.EXAMPLE: Veni, Vidi, Visa: I came, I saw,I did a little shopping Infants don’t enjoyinfancy like adults do adultery

3.2.2 Antinomy- This module detects thepresence of antinomies in a sentence whichhas been developed exploiting the lexicaldictionaryWordNet.EXAMPLE: A clean desk is a sign of acluttered desk drawer Artificial intelligenceusually beats real stupidity

3.2.3 Adult slang recognitionmoduleThis module analyzes the presence of adultslang searching in a set of pre-classifiedwords.EXAMPLE: The sex was so good that eventhe neighbors had a cigarette ArtificialInsemination: procreation withoutrecreation

ii). Natural language Processing Before speech recognition can work, therearises need of natural language processing.One need to understand how languageshould be interpreted as there is ambiguityin processing any language as shown in fig.2

Figure 2:Processing of NLP

A huge amount of information or corpus isrequired such that an accurate NLP enginecan be build. After obtaining the corpus,sentence chunking is done so that eachsentence can be converted to speech orvoice. STT is speech to text, where sound orvoice which is received core engine processthese sentences depending upon the situationand it is moved forward to speechsynthesizer to respond. Processed separately.They are further divided into named entityrecognition

iii). Microsoft Speech RecognitionThis speech recognition consists of TTS andSTT. TTS is text to speech system wheretext is converted to speech or voice. STT isspeech to text, where sound or voice whichis received from the user is converted to textfor processing of information by bot.Initially, sentence given or received ispassed to NLP and after that, NLP coreengine process these sentences dependingupon the situation and it is moved forward tospeech synthesizer to respond.

vts-6

Text Box


vts-6

Text Box


i). Corpus -To respond accurate to questionor in any situation given to the bot.ii). Brain -The brain file size is aroundninety mega bytes which is compressedform of corpusiii). Bot Engine –consists of followingcomponentsa) Kernel b) AIML Parser c) Utility Managerd) Word Substitution e) Pattern Manager

3.3 Humor evocation areaThe humor evocation area computes thesemantic similarity between the sentencesaid by the user and the encoded sentence insemantic space; also it tries to answer to theuser with a funny expression which isconceptually close to the user input. Thisprocedure allows to go beyond the firmpattern-matching rules, generating thefunniest answers which semantically fit theuser query.

3.3.1 Semantic space creationA semantic representation of funnysentences has been obtained mapping themin a semantic space. The semantic space has

been built according to a Latent SemanticAnalysis (LSA) based approach described inAgostaro (2005)Agostaro (2006). we use thefollowing similarity measure Agostaro(2006):

The closer this value is to 1, the higher is thesimilarity grade. The geometric similaritymeasure between two items establishes asemantic relation between them. Inparticular given a vector s, associated to auser sentence s, the set CR(s) of vectors sub-symbolically conceptually related to thesentence s is given by the q vectors of thespace whose similarity measure with respectto s is higher than an experimentally fixedthreshold T.

Thachatterbot can also develop its ownAIML KB mapping in the suggestive areanew items like jokes, riddles and so onintroduced by the user during theconversation.

3.4 Emotional areaThe generation of emotional expressions inthe Talking Head suits emotional area. Thereare many possible models of emotions inliterature. They can be distinguished in threedifferent categories of models. The first onerefers the models describing emotionsthrough collections of different dimensions (arousal, , unpredictability, valence, potency,intensity ...). The second one refers modelsbased on the hypothesis that a human beingis able to express only a limited set ofprimary emotions. All the range of thehuman emotions should be the result of the

combination of the primary ones. The lastcategory includes mixed models, accordingto which an emotion is generated by amixture of basic emotions parametrized by aset of dimensions.

4. EHeBby talking headOur talking head can speak severallanguages since it is conceived to be a multi-platform system. so that variousimplementations have been realized. Thedifferent components of our model arepresented: model generation, animationtechnique, coarticulation, and emotionmanagement.

vts-6

Text Box


vts-6

Text Box


The face model generation involves aspecial tool called FaceGen which is mainlyused for the creation of 3D human heads andcharacters as polygon meshes. Numericalparameters are used to control the facialexpressions. A variety of facial expressionscan be obtained since theintensity of eachemotion can be controlled by a parameter ormixed to each other.Morphing is done for facial expressions toexpress the animation. It starts fromgeometrics called keyframes. Thekeyframes vertex translates from its positionto occupy one of the corresponding vertex inthe subsequent keyframe. A set of visemesare generated for this reason instead ofmodifying a single head geometricmodel.This approach is less efficient thananimation engine in case of modifying thefacial expressions like tongue position,labial protrusion and so on.

4.1 Cohen-Massaro model

This is a mathematical model used forcoarticulation. This computes the weights tocontrol the keyframe animation. the vertexespositions of an intermediate mesh betweentwo keyframesare determined by theseweights. Dominance is the name given to theinfluence on the organs of articulation of theface rather than perpendicular segments andcan be mathematically defined as a timedependent function.

wherea is the peak for t = 0, q and c controlthe function slope and t is the time variablereferred to the mid point of the speechsegment duration. The coarticulation can bethought as composed by two sub-phenomenons: the pre- and post-articulation. The prosodic sequence S oftime intervals [ti−1,ti[ associated to eachphoneme can be expressed as follows:

A viseme is described as “active” when ‘t’falls into the corresponding time interval.The preceding and the following visemes aredescribed as “adjacent visemes”. Theweights are computed as follows:

whereti the mid point of the i-th timeinterval. The wi must be normalized:

The coordinates of the interpolating viseme

vertexes v(l)int(t) ∈{Vint(t)} will be

computed for each time instant as follows:

the index l indicates corresponding verticesin all the involved keyframes. Also thiscomputation is simplified by ourimplementation.

4.1.1 Dominant visemes andDiphthongs

vts-6

Text Box


vts-6

Text Box


A sequence of two adjacent vowels is calleddiphthong. The vowels in a diphthong mustbe visually distinct as two separate entities.The visemes belonging to the vowels in adiphthong mustnot influence each other.Otherwise, both the vowel visemeswouldnotbe distinguishable due to theirfusion.Thedominance of a vowel withrespect to a consonant is accomplished witha less steep curve than the consonant one asshown in the figure. Fig 3- The dominance function for the

diphtong case (a) and the weightsdiagram (b) for the diphthong caseFig. 4. The same of Fig.2 for the vowel-consonant case

4.2 Audio stream synchronizationFacial movements are driven by the .phofile since the Talking Head is essentiallysynchronized with the audio streamingwhich determines the phoneme (viseme) andits duration. a variety of options to producethe prosody for the language and speechsynthesizer to use is provided by Espeak.

5. Some examples of interaction5.1 Example of humorous sentencesgenerationThe following is an example of an humorousdialogue

User: What do you think about robots?EHeBby: Robots will be able to buyhappiness,but in condensed chip form!!

<category><pattern>WHAT DO YOU THINK ABOUTROBOTS</pattern><template>Robots will be able to buyhappiness,but in condensed chip form!!</template>

< /category>The pattern delimits what the user can say.Every time the pattern is matched,thecorresponding template is activated.

5.2 Example of humor recognitionThe recognition of humorous sentences isobtained using specific tag inserted into thetemplate, as shown in the followingcategories<category><pattern>CAN I TELL YOU AJOKE</pattern><template>Yes you can</template>< /category><category><pattern>*</pattern>

vts-6

Text Box


vts-6

Text Box


<that>YES YOU CAN</that><template><srai><humorlevel><star/></humorlevel><srai></template>< /category>If the previous answer of the chatterbot was“Yes you can” then the second category isactivated .The humor level can assume threedifferent values, low, medium and high.Thefollowing example shows the categoryrelated to a high humor levels <category><pattern>HIGH *</pattern><template><think><prosody>

<star/></prosody></think><joy intensity="080" /></template></category>

Fig. 5. TH reaction to a funny joke

5.4 AIML categories for targetingThe humorous chatterbotwill update its ownsub-symbolic data through a targetingprocedure, which will map new acquiredriddles in the semantic space. By means ofthe ad-hoc created AIML tag addRiddle,thetargeting is obtained as shown in thefollowing chunk of AIMLcode:<category><pattern>Listen this joke *</pattern><template><humorlevel><star/></humorlevel><think><addRiddle><star/></addRiddle>< /think>< /template>< /category>A sentence introduced by the user as avector in thesemantic space is effected tocode by theaddRiddle tag has by means ofthe folding-in method. Think tag hides theentire procedure to theuser. In this manner,the user will see only the chatterbot’sresultto his joke..

Procedure 1.) A brain file is created initially, by usingaiml file.

2.) If brain file is not present then create abrain.

3.) If brain file is present then load the brainfile in the model.

4.)when the brain is loaded, bot waits for therequest from user or client.

5.) The request is provided by the user eitherin the form of voice or text.

6.) Then the model receives request andforwards to pattern manager

7.) Pattern matching algorithm is appliedand sent to the brain.

8.) The model uses the brain and get suitableresponse and forwards it to the user

vts-6

Text Box


vts-6

Text Box


Flow Diagram of Request and Response

6. ResultsTwo appropriate data sets have been createdin order to validate the humor recognition.Theformer, called DataSet1, is composed of 100humorous phrases extracted by ad-hocselectedweb sites, characterized by the presence of apercentual of humoristic features, as shownin table 1.

Table 1:Humoristic Features Distributionin DataSet1The latter, called DataSet2, is composed of200 phrases, equally distributed betweenhumorousand non-humorous sentences, where thenon-humorous examples are alwaysextracted

from Internet and chosen among titles ofsets definitions ,proverbs and newspapers.Theimplemented algorithms are well performedin humor recongition, as shown in table 2.

Table 2: Humor Recognition in theanalyzed DataSets

The results will be considered satisfactory.Moreover if the humor recognition areacannotidentify the humorous level of sentencesgiven by the user, the dialogue can continuein afunny way. In fact, the chatterbot exploitsthe semantic space, which allows retrievingentertainingsentences so that the dialogue can continuein a funny approach.

7. ConclusionChatbot is proved to be a boon in anyindustry where one can look for assistance

vts-6

Text Box


vts-6

Text Box


without need of human. Bot itself acts ashuman mind to provide correct information.It can be better as a teacher in education. Itcan provide appropriate details of medicinesto be consumed for particular disease inhealthcare industries and for traveler it canbe a guide assistance to provide him shortestpath for travelling particular with minimumexpenses.The whole architecture relieson asuitable AIML-based chatterbot, and an

animation engine for the talking head.Thechatbotreasoner module is based on anextended AIML architecture dealt with usingsuitable tags. The whole system has beentested on the humor recognition task withsatisfactory results. However, our system iscurrently under development and muchwork has to be done in order to improve thewhole architecture.

Referrence:1.Mihalcea R. and C.Strapparava. (2006)Learning to laugh (automatically):ComputationalModels for Humor Recognition. ComputerIntelligence, Volume 22, 2006MultiWordNet (2010):http://multiwordnet.itc.it/english/home.phphttp://www.oneliners-and-proverbs.com/ andhttp://www.bdwebguide.com/jokes/1linejokes-1.htm.2.Waters K, Levergood T M. An automaticlip-synchronization algorithm for syntheticfaces.s.l. : MULTIMEDIA ’94 Proceedings of thesecond ACM international conference onMultimedia ISBN:0-89791-686-73.Weizenbaum, J. ELIZA – A computerprogram for the study of natural languagecommunication between man and machine.Communications of the ACM, 10(8):36–45(1966). 4.Cliff, C. and Atwell, A. Leeds Unixknowledge expert: a domain-dependentexpert system generated with domain-independent tools. BCS-SGES: BritishComputer Society Specialist Group onExpert Systems journal, 19:49–51 (1987). 5. Wallace, R., Tomabachi, H., and Aimles,D. Chatterbots go native: Considerations for

an eco-system fostering the development ofartificial life forms in a human world.Published (2003), online:http://www.pandorabot.com/pandora/pics/chatterbotsgonative.doc. 6.Bran, E. Chabot’s in derKundenkommunikation (Chabot’s incustomer communication). Springer, Berlin(2003). 7..Abu Shawer, B. and Atwell, A Using thecorpus of spoken Afrikaans to generate anAfrikaans Chabot SALALS Journal:Southern African Linguistics and AppliedLanguage Studies, 21:283–294 (2003). 8.Abu Shawer B. and Atwell A. Differentmeasurement metrics to evaluate a Chabotsystem Proceedings of the NAACL'07Workshop: Bridging the Gap: Academic andIndustrial Research in Dialog Technologies.Pp.89-96, ACL(2007). 9.Abu Shawer B. and Atwal A. MachineLearning from dialogue corpora to generatechatbots. Expert Update journal, Vol. 6, No3, pp 25-30 (2003). 10. Steven Birds, Ewan Klein and EdwardLapper. Natural Language Processing withPython by O'Reilly Media, Inc. (2009). 11.Kalra P, Mangili A., Magnetat-ThalmannN, Thalmann D. Simulation of facial muscleactions

http://www.bdwebguide.com/jokes/1linejokes-1.htm

http://www.bdwebguide.com/jokes/1linejokes-1.htm

vts-6

Text Box


vts-6

Text Box


based on rational free form deformations.SCA ’06 Proceedings of the 2006 ACMSIGGRAPH Eurographics symposium onComputer animation ISBN:3-905673-34-712.Löfqvist, A. (1990) Speech as audiblegestures. InW.J. Hardcastle and A. Marchal(Eds.) SpeechProduction and Speech Modeling.Dordrecht: Kluwer Academic Publishers,289-322.13.Lee Y, Terzopoulos D, Waters K.Realistic modeling for facial animation.Proc. ACM

SIGGRAPH’95 Conference, Los Angeles,CA, August, 1995, in Computer GraphicsProceedings, Annual Conference Series,1995, 55-62.14.Liu K, Ostermann J. Realistic TalkingHead for Human-Car-EntertainmentServices. IMA2008 Informationssystemefür mobileAnwendungen, GZVB e.V. (Hrsg.), pp. 108-118, Braunschweig, Germany15. Jonas Sjobergh and Kenji Araki. A VeryModular Humor Enabled Chat-Bot forJapanese. Pacling 2009

vts-6

Text Box

E-ISSN: 2348 – 8387 www.internationaljournalssrg.org 9

vts-6

Text Box


ARTIFICIALLY INTELLIGENT VIRTUAL ASSISTANT P ... · PDF filefor identifying input and output...

Documents

Transcript of ARTIFICIALLY INTELLIGENT VIRTUAL ASSISTANT P ... · PDF filefor identifying input and output...