This is just a brief glossary of some basic programming concepts. It's not intended to be a substitute for a proper course of instruction or a basic textbook, but it may be useful as an introduction to any unfamiliar terms or ideas.
Data in computers is stored in specific locations in a computer's memory. A variable provides a way to access that information, and allows the data at that location to be changed. You can think of a variable as a handle to manipulate the information.
Variables can be strongly or weakly typed. Some languages, such as C or Java, require that a given variable hold only one type of data, such as an integer or a floating point number, and that this type be declared before the variable can be used. A Perl variable is weakly typed: a single variable can hold single characters, strings of characters or numbers of any sort, and you don't have to say beforehand what sort of data it will hold.
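As a small illustration of a variable that is not tied to one declared type, here is a sketch in Python, which (like Perl) does not ask you to declare what sort of data a variable will hold. The names `item` and `kinds` are placeholders chosen for this example only.

```python
# One variable name, rebound in turn to values of three different types.
item = 5                            # an integer
kinds = [type(item).__name__]

item = 2.5                          # now a floating point number
kinds.append(type(item).__name__)

item = "five"                       # now a string of characters
kinds.append(type(item).__name__)

print(kinds)                        # ['int', 'float', 'str']
```

In C or Java the second and third assignments would be rejected, because `item` would have been declared as an integer.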
You can call a variable whatever you like, but its name must contain only numbers, letters or underscores, and the first character of a variable name usually can't be a number, e.g.,
    a1 = 5;    /* a valid name */
    1a = 5;    /* an invalid name */
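The naming rule above can be written down as a pattern. The following is a minimal sketch in Python; `NAME_RULE` and `is_valid_name` are hypothetical names invented for this example, and real languages may add further restrictions (reserved words, for instance) that are not checked here.

```python
import re

# Letters, digits or underscores only; the first character must not be a digit.
NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_name(name):
    """Return True if `name` follows the naming rule described above."""
    return NAME_RULE.fullmatch(name) is not None

print(is_valid_name("a1"))   # True
print(is_valid_name("1a"))   # False
```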
Loops allow the programmer to execute the same piece of code multiple times. There are two main constructs to achieve this: the for loop and the while loop.
The for loop is generally used when you know how many times you need to run the code. The general format is:
    for (initial statement; test condition; iteration) {
        do something
    }
E.g., to sum the numbers 1-100:
    sum = 0;
    for (i = 1; i <= 100; i++) {
        sum += i;
    }
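For comparison, here is the same sum as a runnable sketch in Python. Python has no three-part `for` statement, so the initial statement, test condition and iteration are all expressed by `range`. The variable name `total` is chosen for this example.

```python
# Sum the numbers 1-100 with a for loop.
total = 0
for i in range(1, 101):   # range stops before 101, so i runs from 1 to 100
    total += i

print(total)              # 5050
```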
While loops tend to be used when the number of iterations is not known in advance. However, the choice is often one of personal preference, as both constructs can accomplish the same goal. The general format is:
    initial statement
    while (test condition) {
        do something
        iterate
    }
E.g., to sum the numbers 1-100 again:
    sum = 0;
    i = 1;
    while (i <= 100) {
        sum += i;
        i++;
    }
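A while loop is most natural when the stopping point is not known beforehand. As a hedged illustration in Python, the sketch below keeps adding 1, 2, 3, ... until the running total first exceeds a limit; the names `limit`, `total` and `i`, and the value 1000, are assumptions made for this example.

```python
# How many consecutive integers, starting from 1, must be added
# before the running total exceeds the limit?
limit = 1000
total = 0
i = 0
while total <= limit:     # test condition
    i += 1                # iterate
    total += i            # do something

print(i, total)           # 45 1035
```

The loop body runs 45 times, but nothing in the code states that number; it falls out of the test condition.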
Perl supports both types of loop construct.
For anything other than the most trivial program, it is good design to compartmentalize your code into blocks that each perform a specific function. If a program performs the same task many times, it is clearly not ideal to have the same code repeated. Putting a frequently used block of code in a subroutine allows you to have that piece of code in just one place, which aids debugging and code maintenance.
So, once you have a particular function working well, e.g., a piece of code that determines the length of a character string, putting it into a subroutine (or method or function, depending on what language you're using) means that the piece of code will now be in a "black box". That is, you pass it an argument and get a result, but how it calculates that result is hidden from the rest of the program.
Here is an example where we make a subroutine that sums all integers up to and including its argument. The subroutine code is separated from the main program by curly braces. Subroutine naming is governed by the same rules as variable naming.
    /* main program */
    sum = getsum(100);
    sum2 = getsum(300);

    /* subroutine declaration */
    sub getsum (arg) {
        total = 0;
        i = 1;    /* the counter must be initialized before the loop */
        while (i <= arg) {
            total += i;
            i++;
        }
        return total;
    }
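The same subroutine can be sketched as runnable Python. Note that in Python the function has to be defined before the main program calls it, and that the counter `i` is explicitly initialized inside the function. The names `total`, `sum1` and `sum2` are chosen for this example.

```python
def getsum(arg):
    """Return the sum of all integers from 1 up to and including arg."""
    total = 0
    i = 1
    while i <= arg:
        total += i
        i += 1
    return total

# main program
sum1 = getsum(100)
sum2 = getsum(300)
print(sum1, sum2)   # 5050 45150
```

The main program never sees `total` or `i`; it only passes an argument and receives a result, which is the "black box" idea.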
Languages such as Java and C require that the data type of the argument and the data type of the returned result are both specified in the subroutine declaration. Perl, however, does not demand this, and relies on documentation to communicate parameters and return types.
Assigning each individual piece of data a unique variable name has its limitations - we have to keep track of too many variables. An array is a consecutive set of memory locations used to store data. Any item in the array is accessed by its position (or index) in the array. Therefore we just need to keep track of an index and the name of the array to keep track of the data. E.g.,
    data[0] = 0;
    data[1] = 1;
    data[2] = 2;
stores 3 pieces of data consecutively. Note that in Perl, as in C and Java, array indices are zero based - the first element in the array has an index of 0. Arrays are ideally accessed by loops, e.g.,
    for (i = 0; i <= 2; i++) {
        print data[i];
    }
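Here is a runnable sketch of the same idea in Python, where a list plays the role of the array. The names `first`, `last` and `seen` are introduced for this example.

```python
# Three pieces of data stored under one name and accessed by index.
data = [0, 1, 2]

first = data[0]                # indices are zero based: the first element is at 0
last = data[len(data) - 1]     # so the last element is at length - 1

seen = []
for i in range(len(data)):     # i takes the values 0, 1, 2
    seen.append(data[i])
    print(i, data[i])
```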
In many languages you have to set the maximum size of the array before you start using it, or allocate memory for the array dynamically. Perl expands and contracts arrays automatically behind the scenes.
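Automatic resizing of this kind can be illustrated with a Python list, which also grows and shrinks without any explicit memory management. This is only a sketch of the behaviour, not of how Perl implements it; `grown`, `removed` and `shrunk` are names chosen for this example.

```python
# Start with an empty list; no maximum size is declared anywhere.
data = []
for value in (10, 20, 30):
    data.append(value)         # the list grows by one element each time

grown = len(data)              # 3

removed = data.pop()           # remove the last element (30); the list shrinks
shrunk = len(data)             # 2

print(grown, removed, shrunk)  # 3 30 2
```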
Hashes, or associative arrays, are special types of arrays where, instead of accessing the data by a numeric index, you access it by a character string, or key. Each key in a hash must be unique and maps to exactly one value (although different keys may hold the same value). The advantage of hashes is that you don't have to remember an array index, or search the entire array to look up one piece of data - you just need the key. So you get the advantages of being able to loop over many items as in an array, and the advantages of meaningful variable names.
The implementation of hashing algorithms can be complicated in some languages, but Perl provides full support for hashes, as described in the hashes part of the tutorial.
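To make the idea concrete, here is a short sketch using a Python dictionary, which is Python's built-in associative array. The keys and values (`"alice"`, `31` and so on) are made-up data for illustration only.

```python
# Each unique key maps to exactly one value.
ages = {"alice": 31, "bob": 27}

ages["carol"] = 45        # add a new key
ages["bob"] = 28          # assigning to an existing key replaces its value

lookup = ages["alice"]    # direct lookup by key, no index and no searching

# A hash can still be looped over, much like an array.
for name, age in ages.items():
    print(name, age)
```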
In programming, there are things, and there are pointers to things. If we think of a variable, that variable acts as a label for one piece of memory. A pointer or reference, however, is a special sort of variable that holds a memory address as its value. This means it can access data at different memory locations.
In the above diagram, the variable my_var holds the value 100, and resides at memory location 00ffe8. my_var_ptr is a pointer variable that holds the memory address of my_var. my_var_ptr can get or set the value held by my_var by dereferencing.
Pointers are extremely useful for manipulating large data structures such as arrays or hashes. If, for example, you wanted to pass an array as an argument to a subroutine, it would be very wasteful to copy each element of the array. It is much faster to pass a pointer to the existing array as an argument. In Perl, references are analogous to pointers in that they hold a memory address. See the references section and the complex data structures section of the tutorial for more details.
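Python has no raw pointers, but its variables hold references to objects, so it can illustrate the point about passing a large structure without copying it. The following is a sketch under that assumption; `double_in_place`, `numbers` and `alias` are hypothetical names for this example.

```python
def double_in_place(values):
    """Double every element of the list that `values` refers to."""
    for i in range(len(values)):
        values[i] *= 2

numbers = [1, 2, 3]
alias = numbers              # a second reference to the same list, not a copy

double_in_place(numbers)     # only a reference is passed; no elements are copied

print(alias)                 # [2, 4, 6]
print(alias is numbers)      # True: both names refer to one list
```

Because the subroutine works through a reference, its changes are visible to the caller. That is the trade-off that comes with avoiding the copy.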