Development of computers 1970-2009



Corpora: From magnetic tape to web access

Knut Hofland, [email protected], icame.uib.no/history/poster.ppt

UNIFOB Aksis, Bergen, Norway

Development of computers 1970-2009

1970s Mainframe computers: Univac, IBM, ICL
1971 Floppy disk (diskette)
1975 Altair 8800 personal computer
1976 Apple I
1977 Apple II
1978 VisiCalc, spreadsheet
1979 WordStar, word processing software
1980 Seagate 5.25” 5 MB hard disk
1981 IBM PC (4.77 MHz, 16/64 kB RAM, 160 kB 5.25” diskette, MS-DOS, CGA)
1982 Commodore 64
1983 IBM PC XT (128 kB RAM, 10 MB HD, 360 kB diskette)
1983 Apple Lisa, first GUI interface
1984 Apple Macintosh (128 kB, 400 kB 3.5” diskette)
1984 First HP laser printer (Apple LaserWriter PS 1985)
1984 IBM PC AT (286 6-10 MHz, 20 MB HD, 256 kB RAM, 1.2 MB diskette, EGA)
1984 MS-DOS 3.1
1985 Windows 1
1985 Philips CM-100 CD-ROM (Apple 1988)
1987 PS/2 (386 8-20 MHz, 640 kB RAM, 1.44 MB 3.5”, 20-70 MB HD, VGA)
1990 World Wide Web, text version
1990 Typical PC: 486 25 MHz, 4 MB RAM, 150 MB HD
1992 Windows 3.1
1993 Mosaic graphic web client
1994 MS-DOS 6.0
1995 Windows 95
1997 Typical PC: Pentium II 233 MHz, 64 MB RAM, 4 GB disk
2001 Windows XP
2007 Windows Vista
2009 Portable PC: Dual Core 2.2 GHz, 4 GB RAM, 400 GB HD
2009 Desktop PC: Quad Core 2.6 GHz, 16 GB RAM, 1000 GB HD

Hard disk capacity: 1980 = 5 MB, 2009 = 1,000,000 MB
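
The two capacity figures above imply a growth factor and an average doubling time, which a short calculation makes explicit. This is only an illustrative sketch: it assumes steady exponential growth between the two data points, and the function name `doubling_time` is made up for this example.

```python
import math


def doubling_time(start_value: float, end_value: float, years: float) -> float:
    """Average doubling time in years, assuming steady exponential growth."""
    doublings = math.log2(end_value / start_value)
    return years / doublings


# Hard disk capacity from the timeline: 5 MB (1980) and 1,000,000 MB (2009).
factor = 1_000_000 / 5
years_per_doubling = doubling_time(5, 1_000_000, 2009 - 1980)

print(f"growth factor: {factor:,.0f}")
print(f"doubling time: {years_per_doubling:.2f} years")
```

Under that assumption the capacity grew by a factor of 200,000, doubling roughly every 1.6 years, which is somewhat faster than the two-year doubling the poster quotes for Moore's law.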

The Brown Corpus was compiled in 1961-64

12.02.1977: ICAME founded in Oslo

29-30.03.1979: First ICAME conference in Bergen

1977-79: ICAME News started in March 1978

The Brown Corpus was converted from the original punched-card format to a more readable format, and errors found during the tagging of the corpus (1971-78) were corrected.

**R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION 0010E1A01
 OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH 0020E1A01
AT ANY IRREGULARITIES TOOK PLACE. **R**T *THE JURY FURTHER SAID IN TER 0030E1A01
M-END PRESENTMENTS THAT THE *CITY *EXECUTIVE *COMMITTEE, WHICH HAD OVE 0040E1A01
R-ALL CHARGE OF THE ELECTION, **QDESERVES THE PRAISE AND THANKS OF THE 0050E1A01
 *CITY OF *ATLANTA**U FOR THE MANNER IN WHICH THE ELECTION WAS CONDUCT 0060E1A01
ED. **R**T *THE *SEPTEMBER-*OCTOBER TERM JURY HAD BEEN CHARGED BY *FUL 0070E1A01
TON *SUPERIOR *COURT *JUDGE *DURWOOD *PYE TO INVESTIGATE REPORTS OF PO 0080E1A01
SSIBLE **QIRREGULARITIES**U IN THE HARD-FOUGHT PRIMARY WHICH WAS WON B 0090E1A01
Y *MAYOR-NOMINATE *IVAN *ALLEN *JR**.. **R**T **Q*ONLY A RELATIVE HAND 0100E1A01

A01 0010 1 The Fulton County Grand Jury said Friday an investigation
A01 0020 1 of Atlanta's recent primary election produced "no evidence"
A01 0020 9 that any irregularities took place.
A01 0030 5 The jury further said in term-end presentments that
A01 0040 3 the City Executive Committee, which had over-all charge
A01 0050 2 of the election, "deserves the praise and thanks of
A01 0050 11 the City of Atlanta" for the manner in which the election
A01 0060 11 was conducted.
A01 0070 1 The September-October term jury had been charged
A01 0070 9 by Fulton Superior Court Judge Durwood Pye to investigate
A01 0080 8 reports of possible "irregularities" in the hard-fought
A01 0090 6 primary which was won by Mayor-nominate Ivan Allen
A01 0100 5 Jr&.
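
Comparing the two samples above suggests how the punched-card coding maps to readable text. The sketch below is a guess at that mapping, not the conversion program actually used. The marker meanings are inferred only from these samples (`*` capitalises the next letter, `**Q`/`**U` are quotation marks, `**A` is an apostrophe, `**.` becomes `&`, `**R**T` starts a new paragraph or sentence). The function names are hypothetical, and the readable format's reference columns are not reproduced.

```python
import re

# Marker meanings are assumptions inferred from the two samples only;
# this is not the documented Brown Corpus coding scheme.
MARKERS = [
    ("**R**T", " "),  # assumed paragraph/sentence-start marker, dropped
    ("**Q", '"'),     # opening quotation mark
    ("**U", '"'),     # closing quotation mark
    ("**A", "'"),     # apostrophe
    ("**.", "&"),     # abbreviation marker (readable sample shows "Jr&.")
]


def join_cards(cards):
    """Drop the trailing sequence field of each card and join the text fields."""
    return "".join(card.rsplit(" ", 1)[0] for card in cards)


def decode(text):
    """Turn upper-case card text with '*' markers into mixed-case text."""
    for marker, replacement in MARKERS:
        text = text.replace(marker, replacement)
    text = text.lower()
    # A single '*' marks the next letter as a capital.
    text = re.sub(r"\*([a-z])", lambda m: m.group(1).upper(), text)
    return " ".join(text.split())


cards = [
    "**R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION 0010E1A01",
    " OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH 0020E1A01",
    "AT ANY IRREGULARITIES TOOK PLACE. **R**T *THE JURY FURTHER SAID IN TER 0030E1A01",
]

print(decode(join_cards(cards)))
```

Words split across card boundaries (such as `TH` / `AT`) rejoin only because the text fields are concatenated without any separator, so leading spaces on a card are significant.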

The LOB Corpus was completed in Oslo/Bergen in 1979. Concordances were made for both the Brown and LOB corpora. The texts and concordances were distributed on magnetic tape and microfiche. One fiche holds 207 pages (each with 72 lines of 132 columns). The LOB concordance contained frequency counts from the Brown Corpus. The LOB KWIC concordance filled 100 fiches.
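
For readers unfamiliar with the format, a KWIC (keyword in context) concordance lists every occurrence of a word centred in a fixed-width line of context. The sketch below shows the idea only and is not the program used for the ICAME concordances. It assumes simple whitespace tokenisation, the function name `kwic` is hypothetical, and the default width of 132 is taken from the page layout quoted above (at 72 lines per page, one fiche of 207 pages holds 14,904 such lines).

```python
def kwic(tokens, keyword, width=132):
    """Return one fixed-width line per occurrence, with the keyword centred."""
    side = (width - len(keyword) - 2) // 2  # context characters on each side
    if side <= 0:
        raise ValueError("width is too small for this keyword")
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() != keyword.lower():
            continue
        left = " ".join(tokens[:i])[-side:]       # context before the keyword
        right = " ".join(tokens[i + 1:])[:side]   # context after the keyword
        lines.append(f"{left:>{side}} {token} {right:<{side}}")
    return lines


text = ("The jury further said in term-end presentments that the City "
        "Executive Committee had over-all charge of the election")
tokens = text.split()

for line in kwic(tokens, "the", width=40):
    print(line)
```

Right-aligning the left context and left-aligning the right context keeps the keyword in the same columns on every line, which is what makes the listing easy to scan.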

The London-Lund Corpus was distributed on tape.

1981: London-Lund KWIC concordance available on tape.

1982-1985: POS-tagging of LOB in Lancaster and Bergen (CLAWS1, Constituent Likelihood Automatic Word-tagging System). Word list and suffix list for look-up were based on the tagged Brown Corpus. Text and concordance available on tape.
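
The look-up step mentioned above can be pictured as follows. This is a minimal sketch under stated assumptions, not the CLAWS1 implementation: the tiny word and suffix lists are invented for illustration (with Brown-style tag names), the function name `candidate_tag` is hypothetical, and the sketch covers only the look-up, not the likelihood-based disambiguation that the system's name refers to.

```python
# Hypothetical toy data; real lists would be derived from a tagged corpus.
WORD_TAGS = {"the": "AT", "jury": "NN", "said": "VBD", "election": "NN"}
SUFFIX_TAGS = {"tion": "NN", "ing": "VBG", "ly": "RB", "ed": "VBD", "s": "NNS"}


def candidate_tag(word, default="NN"):
    """Look the word up in the word list, then fall back to the suffix list."""
    key = word.lower()
    if key in WORD_TAGS:
        return WORD_TAGS[key]
    # Try longer suffixes first so that "tion" wins over "s", for example.
    for suffix in sorted(SUFFIX_TAGS, key=len, reverse=True):
        if key.endswith(suffix) and len(key) > len(suffix):
            return SUFFIX_TAGS[suffix]
    return default  # nothing matched: fall back to an assumed default tag


for word in ["The", "investigation", "irregularities", "conducted"]:
    print(word, candidate_tag(word))
```

A real tagger would return several candidate tags per word and choose among them in context; a single tag is returned here only to keep the example short.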

1987: Melbourne-Surrey Corpus available (100,000 words of newspaper text). ICAME News became ICAME Journal. A version of the Brown Corpus indexed by the MS-DOS program WordCruncher was made by Randall L. Jones of Brigham Young University (11 MB including index files). The index was so efficient that the program could be used on a standard IBM PC XT/AT. Distribution on diskettes started. The Kolhapur Corpus (Indian English) and the Lancaster Spoken English Corpus were added to the collection. A mail-based info-server was started (FAFSRV at NOBERGEN, EARN/BITNET).

1990: Polytechnic of Wales Corpus.

1992: Lancaster Parsed Corpus, Corpora list started. FTP info-server. Gopher server in 1993.

ICAME CD-ROM collection, version 1. Contained the Brown, LOB, Kolhapur, London-Lund and Helsinki corpora, all indexed by WordCruncher. Macintosh/Unix version of the texts. Texts also indexed by the MS-DOS program TACT.


1995: Newdigate newsletters, ICAME web-site, 900 members on Corpora list

2000: ICAME CD-ROM, version 2. COLT CD-ROM with sound files. Internet search of the main corpora for holders of the CD-ROM.

2009: More than 3000 members on the Corpora list.

Some statistics

Content of ICAME CD, version 2:

Moore's law: transistor count doubles every two years

Future:

More material, new CD/DVD

More corpora searchable on Internet

Part of CLARIN (www.clarin.eu)