Fu-Hsiang’s research log

Post on 05-Jan-2016


2004/1~2004/6


Half-year plan

(Gantt chart with weekly columns from 4 January 2004 to 23 May 2004; the task bars were not preserved.)

ID  Task name
1   Benchmark
2   Training Keyword by N-gram
3   Benchmark (Accurate Testing)
4   Implement algorithm into DG
5   Benchmark & Evaluation
6   Thesis Writing


2004/1/5~2004/1/11

(a) Work done last week
• Re-ran the external and internal tests with a 5 KB white page
• Processed pornography homepage text

(b) Work planned for this week
• Process pornography homepage text and try training on this data
• Revise the external and internal test results

(c) Notes
Last week I re-ran the external and internal tests with a 5 KB web page, but the results contained some mistakes. I will therefore revise the external-results slides and analyze the internal results to make new slides.


2004/1/12~2004/1/16

(a) Work done last week
• Processed pornography webpage text and tried training on this data
• Revised the external and internal test results

(b) Work planned for this week
• Select the keywords produced by N-gram training and put them into DansGuardian
• Revise slides

(c) Notes
Last week I processed the webpage text and trained on it with the N-gram tool, which yielded many keywords for 2-gram, 3-gram, 4-gram, 5-gram and 6-gram. This week I will select keywords and put them into the DG.
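The log does not say which N-gram tool was used or how it tokenizes text. As a minimal sketch, assuming character-level n-grams (a common choice for Chinese text, which has no spaces between words), candidate keywords for 2-gram through 6-gram could be collected as below. The function names and the `min_count` cutoff are hypothetical and are not taken from the original work.

```python
from collections import Counter


def char_ngrams(text, n):
    """Yield every run of n consecutive characters in text."""
    for i in range(len(text) - n + 1):
        yield text[i:i + n]


def count_ngrams(pages, n_values=range(2, 7)):
    """Count character n-grams of each length across a list of page texts."""
    counts = {n: Counter() for n in n_values}
    for page in pages:
        # Assumption: whitespace is layout noise, so drop it before counting.
        compact = "".join(page.split())
        for n in n_values:
            counts[n].update(char_ngrams(compact, n))
    return counts


def candidate_keywords(counts, min_count=2):
    """Keep n-grams seen at least min_count times, most frequent first."""
    return {
        n: [(gram, c) for gram, c in counter.most_common() if c >= min_count]
        for n, counter in counts.items()
    }


if __name__ == "__main__":
    pages = ["免費成人影片", "成人影片下載"]  # illustrative input only
    for n, grams in candidate_keywords(count_ngrams(pages)).items():
        print(n, grams)
```

Frequency alone produces many fragments that are not meaningful words, which is consistent with the manual selection step planned for the following weeks.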


2004/2/2~2004/2/6

(a) Work planned for this week
• Prepare for the reserve officer exam
• Select the Chinese keywords and add them to DansGuardian

(b) Notes
This week I will select the Chinese keywords and put them into DansGuardian.


2004/2/6~2004/2/13

(a) Work done last week
• Reserve officer exam
• Sieved out the proper Chinese keywords from the 2-gram to 6-gram results

(b) Work planned for this week
• Finish sieving out the Chinese keywords from the training data
• Add the keywords with scores to DansGuardian

(c) Notes
Last week I sieved out some of the proper Chinese keywords from the 2-gram to 6-gram results. However, there are too many keywords. This week I will continue sieving out the Chinese keywords from the training data and add them to DansGuardian with scores.
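To illustrate what adding keywords "with scores" implies, here is a minimal sketch of weighted keyword scoring: each keyword carries a weight, the weights of the matches on a page are summed, and the page is blocked when the total reaches a limit. The function names, the sample weights and the rule that every occurrence counts are assumptions for illustration; this is not DansGuardian's actual code or configuration.

```python
def page_score(text, weighted_keywords):
    """Sum weight * (non-overlapping occurrences) for each keyword in text."""
    return sum(weight * text.count(keyword)
               for keyword, weight in weighted_keywords.items())


def is_blocked(text, weighted_keywords, limit):
    """Block the page when its total score reaches the limit."""
    return page_score(text, weighted_keywords) >= limit


if __name__ == "__main__":
    # Hypothetical weights; real values would come from the selected keywords.
    keywords = {"成人": 10, "免費": 5}
    page = "免費成人影片 成人"
    print(page_score(page, keywords), is_blocked(page, keywords, limit=20))
```

Under this scheme, adjusting a keyword's score changes how many matches are needed to cross the limit, which is why accuracy has to be re-benchmarked after each adjustment.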


2004/2/16~2004/2/20

(a) Work done last week
• Finished sieving out the Chinese keywords from the training data
• Added the keywords with scores to DansGuardian
• Benchmarked DansGuardian with the Chinese keywords

(b) Work planned for this week
• Adjust the scores of the Chinese keywords in DansGuardian
• Implement the Early blocking & Early bypassing algorithm in DansGuardian

(c) Notes
Last week, I finished sieving the Chinese keywords, added them to DansGuardian and ran the accuracy benchmark. This week, I will adjust the scores of the Chinese keywords in DansGuardian and implement the Early blocking and Early bypassing algorithm in DansGuardian.


2004/2/23~2004/2/27

(a) Work done last week
• Adjusted the scores of the Chinese keywords in DansGuardian
• Analyzed the DansGuardian program

(b) Work planned for this week
• Benchmark DansGuardian accuracy after adjusting the scores
• Implement the Early blocking & Early bypassing algorithm in DansGuardian

(c) Notes
Last week, I spent most of my time understanding the DansGuardian code and less time adjusting the keyword scores. This week, I will try to add code to DansGuardian and benchmark its accuracy again. After that, I will spend two weeks implementing the Early blocking & Early bypassing algorithm in DansGuardian and two weeks putting the code into webfd.


2004/3/1~2004/3/5

(a) Work done last week
• Benchmarked DansGuardian accuracy after adjusting the scores
• Started implementing the Early blocking & Early bypassing algorithm in DansGuardian

(b) Work planned for this week
• Implement the Early blocking & Early bypassing algorithm in DansGuardian
• Benchmark DansGuardian with Early blocking & Early bypassing

(c) Notes
Last week, I benchmarked DansGuardian accuracy with the new scores and modified DansGuardian, but many problems remain. This week, I will try to solve them (separate score computation, blocking threshold, bypassing threshold, ...).
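The log names a blocking threshold and a bypassing threshold but does not spell out the algorithm. One plausible reading, sketched below under that assumption, is to score the page incrementally as chunks arrive: stop and block as soon as the running score reaches the blocking threshold, and stop and let the page through once enough content has been scanned with the score still at or below the bypassing threshold. All names and parameters here (`scan_page`, `block_at`, `bypass_at`, `bypass_after`) are hypothetical, and this is not the author's implementation.

```python
def count_new_matches(window, keyword, carried):
    """Count matches that end after the first `carried` characters of window.

    Matches lying entirely in the carried-over prefix were already counted
    when the previous chunk was scanned, so they are skipped here.
    """
    count = 0
    pos = window.find(keyword, max(0, carried - len(keyword) + 1))
    while pos != -1:
        count += 1
        pos = window.find(keyword, pos + 1)
    return count


def scan_page(chunks, weighted_keywords, block_at, bypass_at, bypass_after):
    """Return (decision, score, characters scanned) for a stream of chunks."""
    longest = max((len(k) for k in weighted_keywords), default=1)
    score = 0
    scanned = 0
    tail = ""  # end of the previous window, kept to catch split keywords
    for chunk in chunks:
        window = tail + chunk
        for keyword, weight in weighted_keywords.items():
            score += weight * count_new_matches(window, keyword, len(tail))
        scanned += len(chunk)
        if score >= block_at:
            return "block", score, scanned    # early blocking
        if scanned >= bypass_after and score <= bypass_at:
            return "bypass", score, scanned   # early bypassing
        tail = window[-(longest - 1):] if longest > 1 else ""
    return "pass", score, scanned             # whole page scanned


if __name__ == "__main__":
    chunks = ["xxba", "dxxx", "bad!", "never scanned"]
    print(scan_page(chunks, {"bad": 10}, block_at=20, bypass_at=0,
                    bypass_after=100))
```

Either early exit saves the cost of scanning the rest of the page, which is the kind of effect a latency benchmark would measure. The trade-off is that a page bypassed early might still contain keywords further down.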


2004/3/8~2004/3/12

(a) Work done last week
• Finished implementing the Early blocking & Early bypassing algorithm in DansGuardian
• Benchmarked DansGuardian latency with Early blocking & Early bypassing

(b) Work planned for this week
• Re-benchmark the latency of DansGuardian with Early blocking & Early bypassing
• Thesis outline

(c) Notes
Last week, I successfully put the Early blocking & Early bypassing code into DansGuardian and ran the latency benchmark, but there were some problems in the results. This week, I will re-test the latency of DansGuardian with the Early blocking & Early bypassing algorithm by adding code to DansGuardian. Furthermore, I must prepare the thesis outline before Friday.


2004/3/15~2004/3/19

(a) Work done last week
• Thesis outline
• Re-benchmarked the latency of the DG with Early blocking & Early bypassing

(b) Work planned for this week
• Implement the modified Early blocking & Early bypassing algorithm in the DG
• Try to benchmark the DG using Avalanche with SmartBits
• Chapter 1 of the thesis

(c) Notes
Last week, I re-benchmarked the latency of the DG with the Early blocking & Early bypassing algorithm, but the throughput was very poor. I may get better results using Avalanche with SmartBits; however, using Avalanche with SmartBits at 604 is troublesome. This week, I will keep coding the modified Early blocking & Early bypassing algorithm into the DG and use my free time to test the DG using Avalanche with SmartBits. Furthermore, I will spend much of my time writing chapter 1 of the thesis.


2004/3/22~2004/3/26

(a) Work done last week
• Implemented the modified Early blocking & Early bypassing algorithm in the DG
• Benchmarked the latency of the DG
• Finished chapter 1 of the thesis

(b) Work planned for this week
• Chapter 2 of the thesis
• Analyze the code of webfd

(c) Notes
Last week, I implemented the modified Early blocking and Early bypassing algorithm in the DG and finished writing chapter 1 of the thesis. This week, I will write chapter 2 of the thesis and analyze the code of webfd.


2004/3/29~2004/4/2

(a) Work done last week
• Analyzed the code of webfd

(b) Work planned for this week
• Chapter 2 of the thesis
• Analyze the code of webfd

(c) Notes
Last week, I analyzed the code of webfd and found a problem in the architecture. This week, I must spend much of my time writing chapter 2 of the thesis and will use the remaining time to analyze the code of webfd.


2004/4/5~2004/4/9

(a) Work done last week
• Analyzed the code of webfd and the DG

(b) Work planned for this week
• Chapter 3 of the thesis
• Modify the code of webfd

(c) Notes
Last week, I analyzed the code of webfd and modified some of it. This week, I will try to finish modifying the code of webfd and write chapter 3 of the thesis.


2004/4/12~2004/4/16

(a) Work done last week
• Analyzed the code of webfd and the DG
• Finished chapter 3 of the thesis

(b) Work planned for this week
• Chapter 4 of the thesis
• Survey issues and solutions for the thesis

(c) Notes
Last week, I wrote chapter 3 of the thesis and ran into some problems. This week, I will spend more time surveying algorithms and solutions, and write chapter 4.


2004/4/19~2004/4/23

(a) Work done last week
• Chapter 4 of the thesis

(b) Work planned for this week
• Revise chapters 1~4 of the thesis
• Chapter 5 of the thesis

(c) Notes
Last week, I wrote chapter 4 of the thesis and discussed it with Po-Ching. This week, I will revise chapters 1 to 4 of the thesis and write chapter 5.


2004/4/26~2004/4/30

(a) Work done last week
• Revised chapters 1~4 of the thesis
• Chapter 5 of the thesis

(b) Work planned for this week
• Revise chapters 1~5 of the thesis
• Revise the thesis slides

(c) Notes
Last week, I wrote chapter 5 of the thesis and modified the code of webfd. This week, I will revise chapters 1 to 5 of the thesis and the slides.


2004/5/3~2004/5/7

(a) Work done last week
• Revised chapters 1~5 of the thesis
• Revised the thesis slides

(b) Work planned for this week
• Revise chapters 1~5 of the thesis
• Modify the code of webfd

(c) Notes
Last week, I revised the slides and chapters 1 to 5 of the thesis. This week, I will spend time modifying the code of webfd.


2004/5/24~2004/5/28

(a) Work done last week
• Benchmarking of webfd / DG

(b) Work planned for this week
• Benchmarking of webfd / DG with snort

(c) Notes
Last week, I did throughput benchmarking of webfd and DansGuardian. The throughput of webfd with all checks enabled (URL and content keyword) is 12 Mbps, but this is not the peak value, because we do not have enough computers and used only 8 for the benchmarking. This week, I will continue benchmarking webfd / DG with snort.


2004/5/10~2004/5/14