Windows 7 (64-bit) Java environment build


Run the Berkeley parser on Windows 7 (64-bit) and build the Java environment

Recorded by Aaron [http://www.linkedin.com/in/aaronhan]

1. Download the JDK, a development environment for building applications, applets, and components using the Java programming language, from “http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html”.

2. Install “jdk-7u45-windows-x64.exe”.

3. Configure the environment variables for Windows 7 (64-bit). [reference: http://blog.163.com/wutianshui@126/blog/static/1869346220099455115417/]

Go to Computer -> Properties -> Advanced system settings -> Environment Variables.


New -> Type “classpath” as the variable name and “C:\Program Files\Java\jdk1.7.0_45\lib\dt.jar;C:\Program Files\Java\jdk1.7.0_45\lib\tools.jar;.;” as the variable value. Press the confirm button.

New -> Type “java_home” as the variable name and “C:\Program Files\Java\jdk1.7.0_45” as the variable value (without a leading semicolon, since a new variable has no previous value to append to). Press the confirm button.


Select the “path” variable from the existing list, press the edit button, and append “;%JAVA_HOME%\bin” to the end of its existing value.
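Once compiling and running work (step 4), a small Java program can double-check the three variables. The sketch below is not part of the original notes: the class name `EnvCheck` and the helper `pathHasEntry` are made up for illustration. It prints the variables as Java sees them and reports whether the JDK's bin directory appears as an entry of the path variable.

```java
import java.io.File;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the JDK or of the original notes.
class EnvCheck {

    // Returns true if `entry` is one of the `separator`-delimited entries of `pathValue`.
    // The comparison ignores case and trailing slashes, which Windows paths tolerate.
    public static boolean pathHasEntry(String pathValue, String entry, String separator) {
        if (pathValue == null || entry == null) {
            return false;
        }
        String wanted = normalize(entry);
        for (String part : pathValue.split(Pattern.quote(separator))) {
            if (!part.trim().isEmpty() && normalize(part).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Lower-cases the path and strips trailing "\" or "/" characters.
    private static String normalize(String p) {
        String s = p.trim().toLowerCase(Locale.ROOT);
        while (s.endsWith("\\") || s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        String path = System.getenv("PATH");
        System.out.println("JAVA_HOME    = " + javaHome);
        System.out.println("CLASSPATH    = " + System.getenv("CLASSPATH"));
        System.out.println("java.version = " + System.getProperty("java.version"));
        if (javaHome == null) {
            System.out.println("JAVA_HOME is not set");
            return;
        }
        String jdkBin = javaHome + File.separator + "bin";
        boolean found = pathHasEntry(path, jdkBin, File.pathSeparator);
        System.out.println(jdkBin + (found ? " is on the path" : " is NOT on the path"));
    }
}
```

Save it as “EnvCheck.java” and compile and run it the same way as the Hello World file. Open a new cmd window first, because windows opened before the variables were changed keep the old values.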

4. Create a “Hello world” file to test your environment.

On the desktop, create the file “HelloWorld.java” and put the following content into it:

--------------------------the following hello world java code----------------

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}

-----------------------------------------------

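Open the cmd window and type the following commands. The source file has to be compiled with javac before it can be run:

cd desktop
javac HelloWorld.java
java HelloWorld

If it prints “Hello, World”, the environment works. Begin the new life of Java programming!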

5. Download the Berkeley parser (https://code.google.com/p/berkeleyparser/downloads/list).

Download the parser file “BerkeleyParser-1.7.jar”.

Download the English grammar file “eng_sm6.gr”.

6. Prepare a segmented English corpus.

In the same directory as the downloaded tools, create an English example file named “sampleenglish.seg” and put the sentence “this is a small house .” into it.
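The example file can also be written by a short program. The sketch below is only an illustration: the class `SegmentSample` and its `segment` method are hypothetical names, and the segmentation is deliberately naive (it puts spaces around a few punctuation marks), so it would mis-split text such as “3.5” or “U.S.”. It is not the tokenization used to build the WSJ corpora.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, shown only to illustrate what "segmented" input looks like.
class SegmentSample {

    // Naive segmentation: surround common punctuation marks with spaces,
    // then collapse runs of whitespace into single spaces.
    public static String segment(String sentence) {
        return sentence.replaceAll("([.,!?;:])", " $1 ")
                       .replaceAll("\\s+", " ")
                       .trim();
    }

    public static void main(String[] args) throws IOException {
        String line = segment("this is a small house.");
        // Write one sentence per line into the current directory.
        Files.write(Paths.get("sampleenglish.seg"),
                    (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
        System.out.println(line); // prints: this is a small house .
    }
}
```

Run it from the directory that holds the parser files so that “sampleenglish.seg” lands next to them.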


7. Run the Berkeley English parser for the first time.

If you put the downloaded files in the directory “E:\Berkeley Parser”, go to the “cmd” window and type the following commands:

E:
cd "Berkeley Parser"
java -jar BerkeleyParser-1.7.jar -gr eng_sm6.gr -inputFile sampleenglish.seg -outputFile englishout.txt -render

The “-render” option generates a picture of the parse tree.

After a moment, the parser generates the files “englishout.txt” and “thisisasmallhouse.png”. The “englishout.txt” file contains the parsed sentence “( (S (NP (DT this)) (VP (VBZ is) (NP (DT a) (JJ small) (NN house))) (. .)) )”, and the .png file contains the picture of the parse tree.
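To reuse the bracketed output in your own Java code, the innermost brackets can be read as tag and word pairs. The following is a minimal sketch, assuming the output format shown above; `ReadParse` and `taggedWords` are illustrative names, not part of the Berkeley parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reader for one line of the parser's bracketed output.
class ReadParse {

    // Matches an innermost bracket of the form "(TAG word)".
    private static final Pattern LEAF =
            Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");

    // Returns the words of the tree in order, each written as word/TAG.
    public static List<String> taggedWords(String tree) {
        List<String> out = new ArrayList<String>();
        Matcher m = LEAF.matcher(tree);
        while (m.find()) {
            out.add(m.group(2) + "/" + m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String tree = "( (S (NP (DT this)) (VP (VBZ is) (NP (DT a) (JJ small) "
                    + "(NN house))) (. .)) )";
        // prints: [this/DT, is/VBZ, a/DT, small/JJ, house/NN, ./.]
        System.out.println(taggedWords(tree));
    }
}
```

The regular expression only looks at leaves, so it ignores the phrase labels (S, NP, VP). A full tree reader would need a small recursive parser instead.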

===========================================

Train English grammar:

Train the Berkeley parser on the “WSJ” English corpora. The “.mrg” documents contain the fully tagged English sentences, while the “.prd” documents do not contain the POS tags of the words. Here we first use the fully tagged “.mrg” documents.

Use the following command to train the English grammar:

After around 8 hours, the training finished as below:


### The grammar name above is not suitable. It should be “TrainedGramEng.WSJ” instead of “TrainedGramChi.CTB7”.


The running information above shows that the default training documents the Berkeley parser uses for the WSJ corpora run from document ID 200 to ID 2199.
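To see which files fall inside that range, the ID can be read from the file name. This is a sketch under an assumption that the notes do not state: that the corpus files are named like “wsj_0200.mrg”, with a four-digit ID. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the file-name pattern is an assumption, not taken from the notes.
class WsjRange {

    private static final Pattern NAME = Pattern.compile("wsj_(\\d{4})\\.mrg");

    // Returns the numeric document ID, or -1 if the name does not fit the assumed pattern.
    public static int documentId(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    // True if the ID lies in the range reported by the training run (200 to 2199).
    public static boolean inDefaultTrainingRange(String fileName) {
        int id = documentId(fileName);
        return id >= 200 && id <= 2199;
    }

    public static void main(String[] args) {
        String[] names = {"wsj_0199.mrg", "wsj_0200.mrg", "wsj_2199.mrg", "wsj_2200.mrg"};
        for (String name : names) {
            System.out.println(name + " -> " + inDefaultTrainingRange(name));
        }
    }
}
```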

The following command can also be run, which shows that the Berkeley parser defaults to the “.mrg” documents rather than the “.prd” documents for the English WSJ corpora.

The beginning of the running information is as below:


Training began at 2013-11-10-16:30; finished at

After training finishes, the running details can be stored with a command of the form “command…. > store.file.name”.
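The same redirection can be done from Java with `ProcessBuilder`. The sketch below is an illustration only: `LoggedRun` and `withLog` are hypothetical names, and the command to run is passed in as arguments because the training command itself is not reproduced in these notes. Unlike a plain “>” in cmd, this version also sends error messages to the log file.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper that runs a command and stores its console output in a file.
class LoggedRun {

    // Prepares the command so that everything it prints goes to logFile.
    public static ProcessBuilder withLog(List<String> command, File logFile) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge error output into standard output
        pb.redirectOutput(logFile);   // write standard output to the file, like "> logFile"
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 2) {
            System.out.println("usage: java LoggedRun <log-file> <command> [args...]");
            return;
        }
        List<String> command = Arrays.asList(args).subList(1, args.length);
        int exitCode = withLog(command, new File(args[0])).start().waitFor();
        System.out.println("exit code " + exitCode + ", details stored in " + args[0]);
    }
}
```

For example, `java LoggedRun store.file.name java -jar BerkeleyParser-1.7.jar -gr eng_sm6.gr -inputFile sampleenglish.seg -outputFile englishout.txt` would run the parsing command from step 7 and keep its console output in “store.file.name”.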

=====================================

Testing:

[[To test the performance of a grammar you can use:

java -cp berkeleyParser.jar edu.berkeley.nlp.PCFGLA.GrammarTester -path <WSJ location> -in <grammar-file>]]

The testing command for the trained grammar:


The testing score shows that:

To record the running details of the testing stage, use the command:

=============================================

Another training and testing of English grammar:

The training command is below:

After around xx hours, the training of the grammar finished as below:


Export the training record using the command below. It runs the training once more, so it is better to use this command when you first begin the training:

The testing command:

The result shows the scores below, which are the same values as in the last testing run:
