GTECH 731 Programming for Geographic Applications

Tuesdays 5:35 p.m. - 9:15 p.m. Room 1090B-HN

Professor Sean Ahearn sahearn@hunter.cuny.edu 212-772-5327 1023 Hunter North CARSI Lab

Teaching Assistant Tony Ierulli [email protected] 914-471-1526

Texts

Required: Learning C# 3.0 (Paperback) by Jesse Liberty and Brian MacDonald.

Optional: Java Programming for Spatial Sciences by Jo Wood. Java is very similar to C#, but not similar enough for us to use this as the primary text. However, it may be useful to read in parallel with the Liberty text.

Other readings may be given out in the form of handouts.

Attendance and Exercises

Assignments: There will be short assignments almost weekly. It is very important to stay up-to-date with these, because each assignment will build directly on the last one. These will account for most of the grade.

Absences: Especially in the first half of the semester, any missed material will be problematic since each topic depends on the preceding topics.

Plagiarism: It is important to do your own work and work through the problems yourself.

Lab policies: Always delete your working files from your local machine and keep all your files on the network drives. Don’t install any software, and otherwise abide by the lab policies:

http://www.geography.hunter.cuny.edu/~tbw/spars/rules.html

1. Programming Background

• Computer basics
• Programs
• Languages
• Role of the operating system
• C# program elements

Computer Basics

The basic concept of a computer was first envisioned by Turing in 1936, when he described an abstract model of the modern computer:

Details From Wood, 2002.

Computer Basics

Turing then devised the “Universal Turing Machine”, a related thought-experiment where the machine can run any other defined Turing machine.

This corresponds most closely to an actual computer, where any algorithm can be run on a single machine.

For an interesting and more in-depth discussion of this topic, see:

Martin Davis. Engines of Logic: Mathematicians and the Origin of the Computer. Chapter 7, "Turing Conceives of the All-Purpose Computer". Norton, 2001. ISBN 0393322297.

Computer Basics

Rather than using arbitrary symbols, computers represent everything as a zero or a one (a bit), usually handled in groups of 8 (a byte).

Most PCs now have a 32- or 64-bit architecture, which means that data is most often treated in units of four or eight bytes, for example:

0001 1000 1010 0111 0111 1111 0000 0000

What exactly these numbers mean depends on the context. They can represent:

• An instruction in a program: ADD two numbers, or MOVE this information from one location in memory to another
• An integer: 123,456 or -12
• A floating-point number: 1.1234 or 123,456.789
• A location in the computer’s memory: the place where the text of the constitution is stored
• Letters: ABCD
• Etc., etc.
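
To make the point about context concrete, here is a minimal C# sketch that takes the 32-bit pattern shown above and reads it back in several ways. The class and method names (BitPatternDemo, AsBytes, AsFloat) are made up for this illustration; BitConverter and Convert are standard .NET classes.

```csharp
using System;

public static class BitPatternDemo
{
    // The 32-bit pattern from the slide, written in hexadecimal:
    // 0001 1000 1010 0111 0111 1111 0000 0000
    public const int Pattern = 0x18A77F00;

    // Split the pattern into its four bytes (byte order depends on the machine).
    public static byte[] AsBytes()
    {
        return BitConverter.GetBytes(Pattern);
    }

    // Reinterpret the very same four bytes as a floating-point number.
    public static float AsFloat()
    {
        return BitConverter.ToSingle(AsBytes(), 0);
    }

    public static void Main()
    {
        Console.WriteLine("As binary : " + Convert.ToString(Pattern, 2).PadLeft(32, '0'));
        Console.WriteLine("As integer: " + Pattern);   // 413630208
        Console.WriteLine("As float  : " + AsFloat());
        Console.WriteLine("As bytes  : " + BitConverter.ToString(AsBytes()));
    }
}
```

The bits never change; only the question asked of them does. The integer and floating-point readings are completely different numbers even though they come from the same four bytes.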

Computer Basics

System diagram

Memory holds two kinds of information. Program memory stores the program itself, which is normally loaded from disk into memory; data memory stores the information manipulated by the program.

The processor treats these two kinds of information differently. Programs are a sequence of instructions executed by the processor; data is information altered and stored by the program.

Diagram from Hordeski, 1990.

Computer Basics

Processor diagram

The processor contains registers, which hold the information it is currently operating on, the current program location, and other critical information.

Registers are located in the heart of the CPU (central processing unit) and represent the fastest-working part of the system.

Information moves between the registers and a cache, a short-term memory on the main chip, which in turn transfers information to and from main memory (RAM).

Information in RAM may then be passed on to disk (e.g., in file/save), or a network, printer, etc.

More information on registers...

Diagram from Hordeski, 1990.

Computer Basics

Memory hierarchy

This results in a hierarchy of storage areas, from fastest to slowest:

• Registers
• Cache
• Memory (nanoseconds)
• Disk (milliseconds)
• Network drives (e.g., file servers) (seconds)

Registers and cache are not usually managed by the programmer directly.

Programs

A program is a series of instructions that operate on data. The central processing unit reads instructions in sequence from memory, and executes them one by one, in a kind of loop:

Diagram from http://en.wikipedia.org/wiki/Image:CPU_block_diagram.svg
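
The loop described above can be imitated in a few lines of C#. This is only a sketch of the idea: the instruction codes (Load, Add, Halt), the two-slot instruction layout, and the ToyCpu class are invented for the illustration and do not correspond to any real processor.

```csharp
using System;

public static class ToyCpu
{
    // Hypothetical instruction codes for this sketch (not a real instruction set).
    public const int Halt = 0;   // HALT   : stop the loop
    public const int Load = 1;   // LOAD n : put n into the accumulator
    public const int Add = 2;    // ADD n  : add n to the accumulator

    // Runs a program stored as (instruction, operand) pairs and returns the accumulator.
    public static int Run(int[] memory)
    {
        int programCounter = 0;  // register: location of the next instruction
        int accumulator = 0;     // register: the value being worked on

        while (true)
        {
            // Fetch the instruction and its operand from memory.
            int instruction = memory[programCounter];
            int operand = memory[programCounter + 1];
            programCounter += 2;

            // Decode and execute.
            if (instruction == Load) accumulator = operand;
            else if (instruction == Add) accumulator += operand;
            else if (instruction == Halt) return accumulator;
            else throw new InvalidOperationException("Unknown instruction " + instruction);
        }
    }

    public static void Main()
    {
        int[] program = { Load, 2, Add, 3, Halt, 0 };
        Console.WriteLine(Run(program)); // prints 5
    }
}
```

Note that the program and its data sit in the same array of numbers; only the position of the program counter decides which numbers are treated as instructions.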

Programs

Each instruction, or group of bits understood as a command by the processor, is loaded from memory by the processor, and results in a particular action being taken.

Rather than using the long binary number or machine code, a programmer can represent the instruction with a mnemonic.

Each kind of processor has its own instruction set, which means that these instructions are different for different chip makers. This is a large part of why, for example, programs compiled for Motorola chips could not run directly on Intel chips.

Example from http://www.compilers.net/paedia/assembly_language/index.htm.

Programs

These mnemonics are the basis of assembly language. In assembly language, you have to explicitly deal with low-level details like registers and locations in memory, which allows you to write very efficient code. However, it is extremely time-consuming and impractical for most applications.

Usually programming is done in a higher-level language, which is automatically translated into machine code. In a high-level language, very few words represent many lines of assembly code.

Example from http://www.pcmag.com/encyclopedia_term/0,2542,t=compiler&i=40105,00.asp

Languages

These high-level languages are all basically variations on replacement algorithms, or grammars: sets of rules that govern which sequence of replacements generates the final program.

There are many ways a language syntax can be described. The following is a sample of the kind of grammar diagram that can succinctly describe a statement. Each valid language statement can be replaced either by other language statements, or by entities that can be ultimately distilled to machine code.

Language grammars are a very large topic, but a detailed knowledge of them is not really necessary when actually programming. See the relevant Wikipedia entry for more details.
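
As a rough illustration of replacement rules, the sketch below defines a two-rule grammar for sums such as "12+30+4" and a small C# class that follows those rules to compute the result. The grammar and the names (TinyGrammar, ParseSum, ParseNumber) are invented for this example and are far simpler than the real grammar of C#.

```csharp
using System;

public static class TinyGrammar
{
    // Grammar for this sketch (hypothetical, far simpler than C#'s):
    //   sum    := number ( '+' number )*
    //   number := digit digit*

    // Evaluates text that follows the "sum" rule, e.g. "12+30+4".
    public static int ParseSum(string text)
    {
        int position = 0;
        int total = ParseNumber(text, ref position);
        while (position < text.Length && text[position] == '+')
        {
            position++; // consume the '+'
            total += ParseNumber(text, ref position);
        }
        if (position != text.Length)
            throw new FormatException("Unexpected character at position " + position);
        return total;
    }

    // Consumes one "number" starting at position and returns its value.
    private static int ParseNumber(string text, ref int position)
    {
        int start = position;
        while (position < text.Length && char.IsDigit(text[position]))
            position++;
        if (position == start)
            throw new FormatException("Expected a digit at position " + position);
        return int.Parse(text.Substring(start, position - start));
    }

    public static void Main()
    {
        Console.WriteLine(ParseSum("12+30+4")); // prints 46
    }
}
```

Each method corresponds to one rule: a "sum" is replaced by numbers separated by plus signs, and a "number" is replaced by digits. Text that cannot be produced by the rules, such as "1+", is rejected.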

Languages

Lineages

Languages follow lineages, where each language shares characteristics of its predecessors. In this course we will be using C#, which is closely related to Java and descends from C and C++.

C is a very low-level, systems-oriented procedural language that made it easy for programmers to write code as economical as assembly language. It also made it easy to make mistakes and write buggy software.

C++ added some more advanced features, making it possible to write more high-level code, but it left all the original problems of C in place. In some ways this made things worse by making them more complicated.

Java rectified most of these problems, but the Java system is geared toward cross-platform development, and is awkward when using system-specific features (like Windows user interfaces).

C# has the benefits of Java, but is closely linked to Microsoft’s .NET framework, which allows you to write fully functional Windows programs that use all of the features of the OS.

Languages

Lineages

For a more up-to-date diagram, go to http://www.levenez.com/lang/history.html#05

Languages

High-level versus low-level

There are a variety of ways of classifying languages. High-level and low-level is sometimes a useful distinction, although it can be misleading because you can, for example, write high-level functions using a low-level language, and some high-level languages fully support low-level functions.

From Wood, 2002.

Languages

Compiled versus interpreted

In interpreted languages, the code is not compiled to machine code ahead of time. Instead, when the program is run, a separate program (the interpreter) translates the instructions into system commands on the fly.

Most scripting languages, like JavaScript, are interpreted. They are usually slower than compiled languages, but not always. The speed of a program depends on many factors, and whether it is interpreted may or may not be the deciding one.

Diagram from http://web.cs.wpi.edu/~gpollice/cs544-f05/CourseNotes/maps/Class1/Compilervs.Interpreter.html

Languages

Compiled versus interpreted

Languages can be compiled or interpreted. In compiled languages, the code is compiled to machine code, and the operating system manages running and terminating the program:

Diagram from http://web.cs.wpi.edu/~gpollice/cs544-f05/CourseNotes/maps/Class1/Compilervs.Interpreter.html

Languages

Hybrid languages

Java, C#, and Visual Basic are examples of hybrid languages. In these cases the compiler generates an intermediate code, which depends on a separate software infrastructure to run.

This allows for more flexibility because the intermediate code is not yet resolved to machine code, so it is more independent of the particular platform it runs on. At the same time, it can be highly optimized because it is compiled.

In C# and .NET, the intermediate code is MSIL, or Microsoft Intermediate Language.

The role of the interpreter is played by the Just-In-Time compiler, or JIT, which creates executable code from the MSIL on the fly.

Diagram from http://www.codeproject.com/KB/dotnet/clr.aspx?df=100&forumid=3272&exp=0&select=412238.
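
One way to see that a compiled C# method really is stored as intermediate code is to ask .NET for it through reflection. The following is a small sketch, assuming a runtime where reflection exposes method bodies (some restricted runtimes return nothing, which the code treats as zero bytes). HybridDemo and its methods are names invented for the example; GetMethod, GetMethodBody and GetILAsByteArray are part of System.Reflection.

```csharp
using System;
using System.Reflection;

public static class HybridDemo
{
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the number of bytes of intermediate code stored for Add,
    // or 0 if the runtime does not expose the method body.
    public static int IntermediateCodeSize()
    {
        MethodInfo method = typeof(HybridDemo).GetMethod("Add");
        MethodBody body = method.GetMethodBody();
        if (body == null)
        {
            return 0;
        }
        return body.GetILAsByteArray().Length;
    }

    public static void Main()
    {
        Console.WriteLine("Add is stored as " + IntermediateCodeSize() + " bytes of MSIL");
        // Machine code for Add is normally produced by the JIT only when it is first needed.
        Console.WriteLine("2 + 3 = " + Add(2, 3));
    }
}
```

The exact byte count depends on the compiler and its settings, so it is not shown here. The point is only that what the compiler stored is MSIL, not Intel or Motorola machine code.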

Role of the Operating System

Running a program

Once a program is compiled, it is run by the operating system. The operating system is responsible for allocating memory for the program, loading its first instructions into the processor, managing the process as it runs, and cleaning up after it terminates.

It also provides the interfaces through which the program communicates with devices, which would otherwise require more specialized code.

Role of the Operating System

In modern operating systems, the OS is responsible for many functions that would otherwise require programmers to rewrite basic operations like drawing a letter on the screen.

One key role of the operating system is to launch and manage programs, which, when running, become processes that the operating system juggles.

Other functions include interacting with the user (managing the mouse, keyboard, and display), managing communication with peripheral devices (disk drives, networks, etc.), displaying graphics, managing windows, and many other functions that once required specialized programs.

That means most of what most programs do is interact with the operating system.

The Microsoft .NET architecture provides a convenient way of accessing OS and network resources. Since we will be working in that environment, a large part of the code we write will involve interacting with .NET.
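
As a small illustration of a program leaning on .NET and the operating system, the sketch below asks for some system information and stores text on disk without containing any code that knows how a disk drive works. The OsServicesDemo class and its RoundTrip method are invented for the example; Environment, Path and File are standard .NET classes.

```csharp
using System;
using System.IO;

public static class OsServicesDemo
{
    // Writes text to a temporary file and reads it back, leaving the disk
    // details to .NET and the operating system. Returns the text read.
    public static string RoundTrip(string text)
    {
        string path = Path.GetTempFileName();   // the system picks a unique temporary file
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllText(path);
        }
        finally
        {
            File.Delete(path);                  // clean up after ourselves
        }
    }

    public static void Main()
    {
        Console.WriteLine("Operating system: " + Environment.OSVersion);
        Console.WriteLine("Processors      : " + Environment.ProcessorCount);
        Console.WriteLine("Read back       : " + RoundTrip("Hello, disk!"));
    }
}
```

Every line in Main and RoundTrip is a request to .NET, which in turn asks the operating system to do the actual work.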

Sample C# program

Here is a very simple C# program. It consists of components that we will get into in more detail in later classes. These elements appear in almost every C# program.

using System;

namespace HelloNameSpace
{
    public class HelloWorld
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");
        }
    }
}

Sample C# program

The “USING” statement defines what part of the .NET framework (or other external components) will be incorporated into this program.

using System;

“NAMESPACE” says that any names (more on this later) created here are part of the given unit, not another (e.g., System).

namespace HelloNameSpace

Curly braces define the beginning and end of any block of code. A block of code means different things in different contexts. Here, HelloNameSpace consists of anything within the block. For clarity, the lines within a block typically share the same level of indentation.

{

Sample C# program

PUBLIC CLASS HELLOWORLD says this code unit, or class (much more on classes later), that we are calling HelloWorld, is available to any external code to use.

public class HelloWorld
{

STATIC VOID MAIN. More on static and void later, but MAIN is a special function name that declares this as the starting point of the program. STRING[] ARGS receives any command-line parameters (for example, if we typed in HelloWorld “Banana” at the command line, args would contain “Banana”).

    static void Main(string[] args)
    {

SYSTEM.CONSOLE.WRITELINE(“HELLO WORLD!”); tells the Console class within the System namespace to write the line “Hello World!”.

        System.Console.WriteLine("Hello World!");
    }
}
}

Main() and WriteLine() are functions, or ways of invoking code that take some action. HelloWorld and System.Console are classes, or units of code that contain functions.
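
To show what args is for, here is a small variation on the sample program that greets whatever is typed after the program name. It is a sketch only: the HelloArgs class and its Greeting function are invented for this example.

```csharp
using System;

public class HelloArgs
{
    // Builds the greeting from the command-line parameters.
    public static string Greeting(string[] args)
    {
        if (args.Length == 0)
        {
            return "Hello World!";
        }
        return "Hello " + string.Join(" and ", args) + "!";
    }

    static void Main(string[] args)
    {
        Console.WriteLine(Greeting(args));
    }
}
```

Assuming the compiled program is named HelloArgs, typing HelloArgs Banana at the command line prints "Hello Banana!", and typing HelloArgs alone prints "Hello World!".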