FossConf 2008 Chennai
Natural Language Processing with Perl
G Jaganadh, C-DAC Thiruvananthapuram
Talk Overview
Introduction
Natural Language Processing
Perl
Perl Lingua Modules
Some examples
Towards the future
Introduction
•Objectives of the talk
Introducing NLP techniques for Language Researchers
Natural Language Processing
Introduction to NLP
Sub fields in NLP
Perl
•Practical Extraction and Report Language
Free and Open Source
Easy to Learn
Powerful regular expressions for text searching
Perl Lingua Modules
Perl Modules for Linguistic Processing
Almost all modules are for English, Dutch, and other European languages
Powerful implementation of different NLP algorithms
Some Examples
Counting words in a text
Pattern Matching
Use of Lingua::EN::Sentence
Use of Lingua::EN::NamedEntity
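Counting words
# Read all of the input into one string
$text = <>;
while ($line = <>) {
    $text .= $line;
}
# Replace every run of non-letters with one newline, leaving one word per line
# (add accented letters to the class for languages that need them)
$text =~ tr/a-zA-Z/\n/cs;
@words = split(/\n/, $text);
for ($i = 0; $i <= $#words; $i++) {
    if (!exists($frequency{$words[$i]})) {
        $frequency{$words[$i]} = 1;
    } else {
        $frequency{$words[$i]}++;
    }
}
# Print each word with its count, in alphabetical order
foreach $word (sort keys %frequency) {
    print "$frequency{$word} $word\n";
}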
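The same count can be written more compactly with a regex match loop, and the output ordered by frequency rather than alphabetically. This is a minimal sketch, not code from the talk: the helper names word_frequencies and by_frequency are hypothetical, only ASCII letters are treated as word characters, and words are lowercased before counting (the slide code keeps case).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): count word frequencies in a string.
sub word_frequencies {
    my ($text) = @_;
    my %frequency;
    # Each maximal run of ASCII letters is one word; lowercase to merge case variants.
    $frequency{ lc $1 }++ while $text =~ /([A-Za-z]+)/g;
    return \%frequency;
}

# Hypothetical helper: words sorted by descending count, ties broken alphabetically.
sub by_frequency {
    my ($freq) = @_;
    return sort { $freq->{$b} <=> $freq->{$a} || $a cmp $b } keys %$freq;
}

my $freq = word_frequencies("the cat saw the dog; the dog ran");
print "$freq->{$_} $_\n" for by_frequency($freq);
```

For the sample string this prints "3 the", "2 dog", then "1 cat", "1 ran", "1 saw", one per line.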
Lingua::EN::Sentence
#!/usr/local/bin/perl -w
use Lingua::EN::Sentence qw( get_sentences add_acronyms );
## adding support for abbreviations
add_acronyms('lt','gen');
# Read the input one paragraph (blank-line separated block) at a time
$/ = "\n\n";
while (<>) {
    $sentences = get_sentences($_);
    foreach $s (@$sentences) {
        print "<s> $s </s>\n";
    }
}
Lingua::EN::NamedEntity
#!/usr/bin/perl
use strict;
use warnings;
use Lingua::EN::NamedEntity;

# Read the whole input at once so no line is skipped before extraction
my $str = do { local $/; <> };
my @entities = extract_entities($str);
foreach my $entity (@entities) {
    print $entity->{entity}, "\n";
}
Pattern Matching
while ($line = <>) {
if ($line =~ m/_____/ ) {
print $line ;
}
}
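As an illustration of what might go in place of the blank pattern above, the sketch below keeps only the lines containing a word that ends in "ing". The pattern, the sample lines, and the helper name matching_lines are all assumptions made for this example, not part of the talk.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): return the lines that match a pattern.
sub matching_lines {
    my ($pattern, @lines) = @_;
    return grep { /$pattern/ } @lines;
}

# Example pattern (an assumption): a whole word ending in "ing", case-insensitive.
my $pattern = qr/\b\w+ing\b/i;

my @lines = (
    "Parsing is a core NLP task.\n",
    "Perl has regular expressions.\n",
    "Tagging follows tokenization.\n",
);

# Prints the first and third sample lines.
print for matching_lines($pattern, @lines);
```

Precompiling the pattern with qr// keeps the matching logic separate from the pattern itself, so the same helper can be reused with other expressions.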
Towards the future
Lingua Modules for Indian Languages
Useful Stuff
•http://search.cpan.org/search?query=Lingua&mode=all
•http://wiki.christophchamp.com/index.php/Perl/Modules/Lingua
Questions?