Conceptual Semantic Relationships for Terms of Precalculus ... · Conceptual Semantic Relationships...

10
Conceptual Semantic Relationships for Terms of Precalculus Study VELISLAVA STOYKOVA Institute for Bulgarian Language - IBL Bulgarian Academy of Sciences - BAS 52, Shipchensky proh. str. bl. 17, 1113 Sofia BULGARIA [email protected] MAYA MITKOVA Gulf University for Science and Technology P.O. Box 7207 Hawally 32093 KUWAIT [email protected] Abstract: In this work, we present the results of research using the techniques for semantically-oriented statistical search for extracting mathematical terms for precalculus with applications to education, terminology and ontology. We offer the combination of statistically-based techniques incorporated in the software for extracting keywords, collocations and co-occurrences as the useful tools for teaching courses development, domain terminlogy defini- tions, and building ontologies. Key–Words: Basic Science in Engineering Education, New Technologies in Education, Research and Education, Computers, Internt, Multimedia in Engineering Education, Research and development in Engineering Education. 1 Introduction The engineering education includes compulsory courses of mathematics at a different extend. How- ever, the university content and proportion of different mathematical subjects in courses for university math- ematics may vary depending on the national educa- tional standards and mostly depends on the secondary and university level education. But traditionally, it in- cludes teaching and mastering general topics of pre- calculus. Thus, mastering basic mathematical concepts is a key teaching activity for the improvement of students’ mathematical ability and development of engineering creativity. The problem of designing courses includ- ing teaching precalculus is often considered as a prob- lem of the way the different basic mathematical con- cepts and their order are transposed and introduced to students in the course outline. The problem of creating practical study course materials can be, also, addressed as a process of creat- ing a mathematical terminological conceptual knowl- edge hierarchy which defines the organization and the structure of teaching precalculus course material. Further, we are going to present and discuss the basic principles and the results of a corpus-based re- search and analysis of three different web-based open- source course materials for precalculus study based on the techniques of computer-supported extraction of lexical semantic and terminological database design. 2 Artificial intelligence approaches to terminology Artificial intelligence (AI) methodogies and tech- niques, recently developed, have influenced tradi- tional terminlogical practice in many ways, result- ing in a wide range of real applications. Thus, the use of electronic text corpora of different genres in- stead text archives used in the past, speeded signifi- cantly the production of different terminlogical refer- ence sources like thesauri, specialized dictionaries or encyclopedias offering fast production of updated edi- tions by using proper software tools assisting various types of search. However, the terminolgy still is considered as the area with its own scope, which is different than that of lexicography, and is mostly focused on describ- ing, creating and structuring the domain conceptual knowledge by defining the domain conceptual seman- tic relationships. It, also, has wide range applica- tions with successful results in education by offer- ing approaches to structuring the teaching material or preparing the related school books. At the same time, terminological reference sources are often used as a reliable base for creat- ing the electronic dynamic terminological conceptual knowledge hierarchical on-line database. Thus, we are going to present and analyse the results of a corpus-based research with appli- cations to education with conceptual semantic rela- tions definitions in the domain of precalculus by us- WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova ISSN: 1790-1979 13 Issue 1, Volume 8, January 2011

Transcript of Conceptual Semantic Relationships for Terms of Precalculus ... · Conceptual Semantic Relationships...

Conceptual Semantic Relationshipsfor Terms of Precalculus Study

VELISLAVA STOYKOVAInstitute for Bulgarian Language - IBLBulgarian Academy of Sciences - BAS

52, Shipchensky proh. str. bl. 17, 1113 [email protected]

MAYA MITKOVAGulf University for Science and Technology

P.O. Box 7207Hawally 32093

[email protected]

Abstract: In this work, we present the results of research using the techniques for semantically-oriented statisticalsearch for extracting mathematical terms for precalculus with applications to education, terminology and ontology.We offer the combination of statistically-based techniques incorporated in the software for extracting keywords,collocations and co-occurrences as the useful tools for teaching courses development, domain terminlogy defini-tions, and building ontologies.

Key–Words:Basic Science in Engineering Education, New Technologies in Education, Research and Education,Computers, Internt, Multimedia in Engineering Education, Research and development in Engineering Education.

1 Introduction

The engineering education includes compulsorycourses of mathematics at a different extend. How-ever, the university content and proportion of differentmathematical subjects in courses for university math-ematics may vary depending on the national educa-tional standards and mostly depends on the secondaryand university level education. But traditionally, it in-cludes teaching and mastering general topics of pre-calculus.

Thus, mastering basic mathematical concepts is akey teaching activity for the improvement of students’mathematical ability and development of engineeringcreativity. The problem of designing courses includ-ing teaching precalculus is often considered as a prob-lem of the way the different basic mathematical con-cepts and their order are transposed and introduced tostudents in the course outline.

The problem of creating practical study coursematerials can be, also, addressed as a process of creat-ing a mathematical terminological conceptual knowl-edge hierarchy which defines the organization and thestructure of teaching precalculus course material.

Further, we are going to present and discuss thebasic principles and the results of a corpus-based re-search and analysis of three different web-based open-source course materials for precalculus study basedon the techniques of computer-supported extraction oflexical semantic and terminological database design.

2 Artificial intelligence approachesto terminology

Artificial intelligence (AI) methodogies and tech-niques, recently developed, have influenced tradi-tional terminlogical practice in many ways, result-ing in a wide range of real applications. Thus, theuse of electronic text corpora of different genres in-stead text archives used in the past, speeded signifi-cantly the production of different terminlogical refer-ence sources like thesauri, specialized dictionaries orencyclopedias offering fast production of updated edi-tions by using proper software tools assisting varioustypes of search.

However, the terminolgy still is considered as thearea with its own scope, which is different than thatof lexicography, and is mostly focused on describ-ing, creating and structuring the domain conceptualknowledge by defining the domain conceptual seman-tic relationships. It, also, has wide range applica-tions with successful results in education by offer-ing approaches to structuring the teaching material orpreparing the related school books.

At the same time, terminological referencesources are often used as a reliable base for creat-ing the electronic dynamic terminological conceptualknowledge hierarchical on-line database.

Thus, we are going to present and analysethe results of a corpus-based research with appli-cations to education with conceptual semantic rela-tions definitions in the domain of precalculus by us-

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 13 Issue 1, Volume 8, January 2011

ing semantically-oriented statistical search to extractthe domain conceptual semantic relations.

2.1 Methodologies for term acquisition

AI methodological approaches to terminology use thegeneral termdomain knowledgeto refer to termino-logical analysis. However, the language of a specificdomain, is in fact, the language with the same gram-mar but well-defined domain conceptual semantic re-lations, which open the question for multilingual in-terpretations.

Thus, the AI approaches to terminology mostlyinclude the computer-assisted methods and techniquesfor terms extraction and for defining their internal se-mantic relations.

2.2 Computer-assisted terminology acquisi-tion

The computer-assisted terminology extraction hasbeen the most successful technique recently devel-oped and applied for the creation of higly structuredand semantically-oriented lexical reference sourceslike dictionaries, thesauri, etc. It is based on the ex-tensive use of electronic text corpora assisting varioustypes of search procedures.

There are two general types of corpus-basedapplications - using rule-based search techniques(mostly encoding grammatical relations like inflec-tion [14, 16], syntax, etc.) and using statistically-based search techniques (mostly for extracting the lex-ical semantic relationships [13, 10] to investigate theword behaviour in large-scale electronic text collec-tions. Some hybrid systems allow the combined ap-plication of both techniques, depending on the specificresearch tasks.

Further, we are going to present the combinationof different statistical search techniques to define thesemantics of some basic mathematical concepts forprecalculus by using the statistically-oriented search.

3 The basic concepts in precalculusfor educational use

The mathematics, itself, can be regarded as a specialkind of symbolic language but, nevertheless, it usesnatural language both for definitions and for explana-tions [15]. Mathematical texts, and related teachingcourses, include numbers, formulas, tables, pictures,and graphics by means of which the meaning of theirsemantic content can be fully unterpreted.

In our research, we have used the texts as theyappear in the teaching courses but we have analysedand interpreted only content words since they standfor the concepts.

3.1 Mathematical concepts and their rela-tions

Acquiring mathematical knowledge is a complicatedprocess of mastering the basic concepts by revealingtheir semantic relations in the appropriate order whichis aimed to develop the ability for thinking and cre-ativity.

It success depends on various requirements likethe appropriate curricula design and a related syllabus,the use of highly successful teaching methods - tra-ditionally used and recently developed (including e-Learning and Learning-oriented methods [1] ), thepreliminary students’ ground knowledge, the commonknowledge from related subjects [9], the students gen-der [8], etc. But in general, it is mostly influenced bythe definition and mastering of the basic concepts overwhich the new knowledge is structured.

3.2 Defining the basic concepts by usingword frequency

The general approach we are using for defining thebasic domain concepts is similar to that used for theEU Life Long Learning project KELLY (KEywordsfor Language Learning for Young and adults) wherethe basic concepts are defined by searching the con-trasting electronic multilingual text corpora databaseincluding the British National Corpus (BNC) for ex-tracting keywords frequency lists [6].

Word frequency lists are interpreted by psychol-ogists as a basic concepts for knowledge acquisitionand understanding. Educationalists hold the view thatword frequency lists are playing an important role forlearning to read and mastering similarity. In general,word frequency lists usually defines the basic domainknowledge concepts.

4 Examining contrasting corpora forkeywords by statistical search

Word frequency lists are widely used for various ed-ucational tasks like curriculum development, writingsyllabus and preparing tests for quality evaluation. Sothat, in our research [17] extracting keywords fre-quency list is a task of prime importance.

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 14 Issue 1, Volume 8, January 2011

Figure 1: Keywords for MathPre and MathWeb.

4.1 Text corpora and keywords definitions

For our research, we have created three electronic textcorpora - MathPre (consisting of precalculus e-coursematerials given at [18]), MathWeb (consisting of pre-calculus e-course materials given at [12]) and Math-Wiki (consisting of precalculus web electronic en-cyclopedic description which reflects the mathemat-ical presentation of precalculus text materials fromWikipedia [11] and which extensively uses the basicmathematical concepts’ relations interpretation intro-duced in [3]) - of almost 200 000 words. Also, theBNC is used as a standard to compare and interpretthe results. In the entire work, we extend the resultsand the analyses based on and already presented in[17].

There are various statistical approaches to definekeywords. In general, most of them define the taskfor keywords extraction as the retrieval and clusteringof statistically similar words [7] and they differ withrespect to the statistics used, i.e. with respect to theway they define the semantic similarity.

For the purpose of our analyses, however, we usethe statistics incorporated in the Sketch Engine (SE)[4, 5] software for processing corpora which allowsthe use of elaborated combined semantically-orientedstatistical approaches and comparison of the resultsbetween several corpora.

The Sketch Engine, also, allows the evaluationof semantic similariry between words based on their

Figure 2: Keywords for MathWiki corpus.

grammatical simmilarity with respect to the relatedinflected word forms or part-of-speech categorizationframes which we did not use for our research.

After using the SE standard statistical options forprocessing our three corpora for keywords definition,we have obtained the results given at Fig. 1 (for Math-Pre and MathWeb) and Fig. 2 (for MathWiki).

The results represents the basic precalculus con-cepts likefunction(s), numbers, polynomials, graphs,equations, etc. They differ with respect to the extendand structure of e-course materials used and with re-spect to the text type of MathPre and MathWeb texts.They, also, reflects the encyclopedic knowlegde pre-sentation structure of MathWiki corpus, which needsa further elaboration to be used for teaching courses.

In general, our keywords frequency lists give thebasic mathematical concepts over which the precal-culus teaching is structured, however, the proportion,their internal conceptual relations, and their orderhave to be clarified by using further statistical researchto define their semantic relations. Further, we are go-ing to define the semantic relationships for the basicconceptfunction(s).

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 15 Issue 1, Volume 8, January 2011

Figure 3: Concordances of the wordfunction fromMathPre corpus.

Figure 4: Concordances of the wordfunction fromMathWeb corpus.

5 Extracting semantic similarity re-lations using statistical measure-ment

There are different approaches to define the semanticcontext. In fact, the context can be defined both inlogical and in linguistic terms, however, it is highlydependent on particular logical or linguistic theory.At the same time, the so-called context-free grammarshave been evaluated as unappropriate tools for naturallanguage processing.

Moreover, in general, grammar analyses arehighly language-dependent, instead statistically-basedmethods which might be used for multilingual appli-cations. Currently, the statistical corpus approachesbased on the measurement of word similarity anddefining words concordances have been widely usedin terminology and lexicography for word sense defi-nition and for definitions of conceptual semantic rela-tions.

There are various statistically-based approachedover which a semantically-oriented search procedurescan be performed. The related corpus query systemsallow great flexibility of statistically-based search forco-occurrences and collocations using different statis-

Figure 5: Concordances of the wordfunction fromMathWiki corpus.

tical techniques where the context is defined in statis-tical quantitative terms.

5.1 Conceptual semantic relations

In general, conceptual semantic relations are regardedas to be of two types - horizontal and vertical. Thehorizontal (linear) semantic relationships are those ofsynonymy, anthonymy, meronymy, i. e. showing se-mantic similarity [13], semantic distance, inclusion(part-of-whole), etc.

The vertical semantic relationships express therelationships of ordering or hierarchy. The verticalsemantic relationships are those realised by hyper-onymy and hyponymy. All types of semantic relation-ships can be defined by experting the related contextsthrough the generation of related word concordancesbased on the use of different statistical corpus-basedapproaches [10].

The concordances give all occurrences of the tar-get word in its related context wich is generated by us-ing statistical search [13]. The example concordancesfor the basic conceptfunctionreceived from MathPre,MathWeb and MathWiki corpora are given at Fig. 3,Fig. 4 and Fig. 5, respectively.

Concordances define the context in quantitativeterms and a further work is needed to be done to de-fine the semantic relationships by searching for co-occurrences and collocations of the related keyword.

6 Extracting conceptual relations byusing co-occurrences and colloca-tions

Concordances and collocations are defined by statisti-cal measurement of the words which are most proba-bly to be found with the related keyword. They assignthe semantic relations between the keyword and its

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 16 Issue 1, Volume 8, January 2011

Figure 6: Collocations of the keywordfunction forMathPre corpus.

Figure 7: Collocations of the keywordfunction forMathWeb corpus.

particular collocated word in statistical terms of prob-ability and (or) frequency. The semantic relations, ingeneral, might be of similarity or of a distance.

The statistical approaches we are using to searchfor co-occurrent and collocated words are based ondefining the probability of their co-occurrence andcollocation. We have used the techniques ofT −

score, MI − score [2] and MI3 − score [10] in-corporated in the Sketch Engine for processing andsearching our three corpora.

Basically for all, the following terms are used:N - corpus size,fA - number of occurrences of thekeyword in the whole corpus (the size of the concor-dance),fB - number of occurrences of the collocatein the whole corpus,fAB - number of occurrencesof the collocate in the concordance (number of co-occurrences).

The related formulas for definingT − score ,MI − score andMI3 − score are given at Fig. 9.

Figure 8: Collocations of the keywordfunction forMathWiki corpus.

However, the three statistical criteria give differentconceptual semantic relationships rang lists but all ofthem evaluate sucessfully the basic domain concep-tual relations.

The most likely collocations candidates words(which are the most frequent collocates) for the key-word function are given at Fig. 6 (for MathPre cor-pus), Fig. 7 (for MathWeb corpus), and Fig. 8(for MathWiki corpus). The rang list is estimatedaccording toT − score criterion but the results forMI − score andMI3 − score are listed asd well.

Thus, the most frequent collocates semanticallyrelated to the conceptfunctionarepolynomial, expo-nential, rational, propositional, complex, logarithmic,

2log AB

A B

f N

f f

A BAB

AB

f ff

N

f

3

2log AB

A B

f N

f f

MI-Score

T-Score

MI3-Score

Figure 9: The statistics used forT − score, MI −

score andMI3 − score measurement.

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 17 Issue 1, Volume 8, January 2011

Rational functionPolynomial

function

Exponential function

Logarithmic function

Trigonometric function

TangentCosine CotangentSine

Complex function

Figure 10: Conceptual semantic hierachy of the key-word function.

trigonometric, etc. They express the hierarchical con-ceptual semantic relationships of the keyword.

Alternatively, the relatively not too frequent col-locations likeperiodic, continuous, inverse, increas-ing, decreasing, real-valued, multi-valued, positive,negative, etc. represent the attributive semantic rela-tionships of the keywordfunction.

6.1 Conceptual semantic hierarchy

Even the most frequent collocations represent the rela-tionship of similarity, they do not necessarily expressthe semantic relationship of synonymy. Mostly, theycan be interpreted in their hierarchical relationships.

Thus, for our research, we are using such inter-pretation, and we analyse thepolynomial functionandexponential functionas the most important concepts tobe mastered in teaching precalculus and therationalfunction as the basic concept to start with. Thelog-arithmic functioncan be presented as the inverse tothe exponential functionand thetrigonometric func-tion can be presented as divided into its subsequentpartssine, cosine, tangent, andcotangent functions.

7 Acquisition of domain terminolog-ical relations

The above described techniques are fully applica-ble for domain terminology acquisition. The resultsachieved, also, can be interpreted by describing anddefining the mathematical terms and their internal se-mantic relations in the domain of precalculus.

Figure 11: Contrasting word frequency lists for Math-Pre and BNC corpora.

However, the most frequent collocations whichexpress similarity do not represent always the seman-tic relationship of synonymy. Thus, thepolynomialfunction, exponential function, and rational functionare analysed as the most important hyponymic con-cepts of the very general hyperonymcomplex func-tion. The hierarchical semantic relations of the con-ceptfunction(s)are presented in Fig. 10.

7.1 Defining the domain terms by searchingcontrasting corpora

As the additional co-occurrences and collocationssearch techniques, the technique for searching con-trasting text corpora for word frequency and compar-ing the results, also, is considered as a useful tool fordefining domain conceptual terms. The technique isused for extracting domain-specific terminology andwe apply it for extracting the mathematical terms inthe domain of precalculus. Hovever, the techniquemight be used for improvement of the results of collo-cation and co-occurrences search techniques.

For our research, we use the three corpora - Math-Pre, MathWeb and MathWiki - and the BNC as con-

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 18 Issue 1, Volume 8, January 2011

Figure 12: Contrasting word frequency lists for Math-Web and BNC corpora.

trasting corpus. We contrast the search results of themost frequent collocated words for the conceptfunc-tion in every one of our three specialized corpora toBNC (which is accepted as a standard corpus). TheMathPre, MathWeb and MathWiki are considered asspecialized corpora since they are thematically ori-ented to be in the domain of precalculus.

Nevertheless, there are some differences betweenthem. In general, the MathPre and MathWeb consistof e-course materials already prepared to be used forteaching purposes. Whereas, the MathWiki corpus isa specialized web encyclopedic knowledge descrip-tion which is prepared to be used mostly as referencesource.

The first page results of contrastive word fre-quency search of MathPre and BNC are given at theFig. 11. The words which have very low or zerofrequency in the BNC are more likely to be the can-didates terms. Consequently, the words which haverelatively high frequency in the MathPre corpus arealso more likely to be the candidates terms. Thus,the words likepolynomial. trigonometric. asymptote,binomial(s), irrational(s), etc. are more likely to beterms candidates.

Alternatively, the results of contrastive word fre-quency search of MathWeb and BNC are given at theFig. 12. The words liketrigonometric, polynomial,parabola, directrix, ellipse, hyperbola, asymptoteetc.are more likely to be terms candidates.

The results of contrastive word frequency searchof MathWiki and BNC are given at the Fig. 13. Thewords liketrigonometric, polynomial, exponential, bi-nomial(s), hyperbola, ellipse, parabola, Dirichlet se-ries, hypotenuse, etc. are more likely to be terms can-didates.

7.2 Linguistic analysis

The terms extracted represent simple words likeal-gebra, divisor, calculus, axiomatization, exponen-tiation, etc. The compound derivative words likex-intercepts, multi-valued, subinterval, semigroup,interpolant, permutation, double-angle, half-angle,holomorphic, single-valued, etc. Abbreviations likecos, sin. The proper names terms likeDirichlet se-ries, Riemann series, Fourier series, Peano axioms,etc. were extracted mostly from MathWiki corpus.

With respect to the part-of-speech, mostly nounstogheter with phrasals are most frequent. Terms areevaluated on the base of their low frequency or nooccurence in the BNC. Thus, verbs typical for themathematical precalculus texts liketriangularize, ax-iomatize, and subscriptewere not occured in BNC.Also, nouns likehyperbola, trinomial, polynomial,and adjectives liketrigonometric, multiplicative, non-

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 19 Issue 1, Volume 8, January 2011

Figure 13: Contrasting word frequency lists for Math-Wiki and BNC corpora.

Figure 14: Collocations candidates of the conceptfunction estimated from MathWiki corpus by usingtheMI − score criterion.

negative, and, surprisingly, the termprecalculuswerenot occurred in the BNC as well.

8 Defining highly specialized termi-nology

The former analysis uses extensively varioussemantically-oriented statistical search techniques todefine the basic conceptual relations in the domainof precalculus with application to education. It, also,apply the techniques to extract the terminologicalrelations in the same domain by comparing differentstatistical techniques.

However, the interpretation of the results wasmade mostly by usingT − score criterion, eventhe results are presented, also, forMI − score andMI3 − score criteria. In further description, we aregoing to use theMI − score criterion to evaluate theproper names terms.

As it was outlined in the previous section, theproper names terms were extracted mostly from Math-Wiki corpus. At the same time, the proper namesterms represent, usually, a higly specialized terminol-ogy. Fig. 14 presents the collocations candidates ofthe conceptfunctionestimated from MathWiki corpusby using theMI − score criterion. Alternatively, allcollocated words were not occurred in the MathPreand MathWeb corpora which suggests that they referto highly specialized concepts.

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 20 Issue 1, Volume 8, January 2011

Figure 15: Co-occurrences of the conceptfunction re-lated to collocatesDirichlet, Fourier andTaylor.

Figure 16: Co-occurrences of the conceptfunction re-lated to its collocateseries.

In fact, the termzeta functionoccurrs on the firstplace followed byDirichlet, Frege, Riemann, Fourier,etc. The conceptfunctionand the collocatesDirichlet,Fourier and Taylor are semantically related throughthe conceptseriesas it is shown at Fig. 15.

Moreover, the semantic relation might be addi-tionaly evaluated by generating the co-occurrences ofthe conceptfunctionrelated to its collocateseries. Theresult is presented at Fig. 16 and it includes the collo-catesDirichlet, Fourier, RiemannandTaylor.

Consequently, the conceptzeta functionand theconceptseriesare related toDirichlet, Fourier andTaylor expressing the specialized terminological rela-tions of the conceptfunction. The general expressionof that relation is the representation of thefunction inseries.

The above conceptual relations represent highlyspecialized mathematical knowledge and are studied

Dirichlet series

Zeta function

Fourier series

Taylor series

Figure 17: Conceptual semantic hierarchy of the con-cepts function related to its collocateseriesand thecollocatesDirichlet, Fourier andTaylor.

at the university level of education. Their conceptualsemantic hierarchy is presented at Fig. 17.

9 Building ontologies

As a very important area of AI, the knowledge rep-resentation techniques use extensively various ap-proaches and techniques used in terminology. In gen-eral, building ontologies is, in fact, the creation ofelectronic dynamic terminological conceptual knowl-edge hierarchical on-line database, and use varioustechniques for semantically-oriented statistical searchto define the co-occurrences and collocations.

The highly specialized proper names terms areusually developed as named entities in the ontolo-gies framework. Thus, extracting basic mathematicalterms of precalculus conceptual relations for buildingontological hierarchy is already defined at Fig. 10.

10 Conclusion and future work

In our research, we have used three related web-basedelectronic corpora consisting of open-source math-ematical texts about precalculus. The final resultsconfirm that it is possible by using the statistically-based software incorporated in the SE by searchingfor keywords to define the basic mathematical con-cepts for teaching precalculus and to refine their con-ceptual hierarchy by searching for collocations andco-occurrences. Also, the methodology has been ex-tended with application to terminology.

The BNC comparative results show very low fre-quency of the basic mathematical concepts for teach-ing precalculus. Surprisingly, we did not found therethe termprecalculusinstead which suggests the ideathat specialised corpora, even of relatively small size,

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 21 Issue 1, Volume 8, January 2011

are the most appropriate for specialize research in-stead standard ones, widely used for general linguisticresearch.

Further, we are going to continue our work by ex-tending it for building a thesaurus-like conceptual hi-erarhy and by comparing the results for some morelanguages so to test the hypothesis for the language-independent nature of conceptual knowledge.

References:

[1] Bontchev, B. and Vassileva, D. (2011). Learning Ob-jects Types Dependability on Styles of Learning. InN. Mastorakis, V. Mladenov, Z. Bojkovic, eds.Re-cent Researches in Educational Technologies, Corfu,Greece. 227-234.

[2] Church, K. and Hanks, P. (1991). Word AssociationNorms, Mutual Information and Lexicography.Com-putational Linguistics16(1), 22-29.

[3] Hazewinkel, M. (2001).Encyclopaedia of Mathemat-ics.Springer-Verlag.

[4] Kilgarriff, A. and Rundell, M. (2002). Lexical Pro-filing Software and its Lexicographic Applications:a Case Study. InProceedings from EURALEX 2002.Copenhagen. 807-811.

[5] Kilgarriff, A., Rychly, P., Smrz, P., and Tugwell, D.(2004). The Sketch Engine. InProceedings from EU-RALEX 2004.Lorient, France, 105–116.

[6] Kilgarriff, A., Reddy, S., Pomikalek, J. and Avinesh,P. (2010). A Corpus Factory for Many Languages.In N. Calzolari ed.Proceedings of the LREC 2010.Malta. 904-910.

[7] Lin, Dekang. (2002). Automatic Retrieval and Clus-tering of Similar Words. InProceedings of theCOLING-ACL.Montreal. 768-774.

[8] Mitkova, M. (2009). Achievement in Basic MathCourses Reflected Through the Gender Difference. InProceedings of the 4th International Conference onResearch and Education in Mathematics (ICREM4),76-81.

[9] Nitsolov, S. and Mitkova, M. (2011). Teaching Col-lisions - Methodological Suggestions. In N. Mas-torakis, V. Mladenov, Z. Bojkovic, eds.Recent Re-searches in Educational Technologies, Corfu, Greece.106-109.

[10] Oakes, M. (1998).Statistics for Corpus Linguistics.Edinburgh University Press.

[11] Precalculus. (2011).http://en.wikipedia.org/wiki/Precalculus

[12] Precalculus Tutorial. (2011).http://jwbales.home.mindspring.com/precal/

[13] Sparck Jones, K. (1986).Synonymy and SemanticClassification.Edinburgh University Press.

[14] Stoykova, V. (2002). Bulgarian noun – definite ar-ticle in DATR. In D. Scott, ed.Artificial Intelli-gence: Methodology, Systems, and Applications. Lec-ture Notes in Artificial Intelligence 2443, Springer-Verlag, 152–161.

[15] Stoykova, V. (2009). Language-dependent orLanguage-independent e-Learning Mathematics.In Proceedings, Part II, Tempus JEP 41110-2006,TEMIT, 71–83.

[16] Stoykova, V. (2010). Representing Lexical Knowl-edge for Bulgarian Inflectional Morphology inDATR. In N. Mastorakis, V. Mladenov, Z. Bojkovic,and D. Simian, eds.Latest Trends on Computers, vol.2, 612–616.

[17] Stoykova, V. and Mitkova, M. (2011). Defining Lex-ical Semantic Relationships for Terms of PrecalculusStudy. In N. Mastorakis, V. Mladenov, Z. Bojkovic,eds.Recent Researches in Educational Technologies,Corfu, Greece. 240-244.

[18] Topics in Precalculus. (2011).http://www.themathpage.com

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION Velislava Stoykova, Maya Mitkova

ISSN: 1790-1979 22 Issue 1, Volume 8, January 2011