**Workbooks and worksheets**

Excel works with files called **workbooks**. Each workbook contains one or more spreadsheets, called **worksheets**. Each worksheet consists of **cells** organized in a rectangular grid. The rows of a worksheet are labeled with a number and the columns are labeled with a letter or series of letters. The first row is labeled 1, the next 2 and so on. The first column is labeled A, the next B, etc. The column after Z is labeled AA, and then AB, AC, and so on. The column after AZ is BA, and then BB, BC, etc. The column after ZZ is AAA, and then AAB, AAC, etc.

**Figure 1 – Sample Excel Worksheet**

Each worksheet in Excel 2007/2010/2013/2016 can contain up to 1,048,578 rows and 16,384 columns (i.e. column A through XFD).

**Cell values, addresses and ranges**

You can enter any of the following into a cell:

- A number (e.g. 45 or -23.006)
- Text (e.g. London)
- A truth value (TRUE or FALSE)
- A formula (e.g. =SUM(A4:B7)/4)

If you want Excel to treat a number as text then you need to precede the number with a single quote – e.g. ‘4.5 is considered to be the text 4.5.

Each cell also has an **address**, consisting of a row and column label. E.g., in Figure 1 the cell with address A5 contains the text “London”, while the cell with address C6 contains the number 8.1.

You can also reference a **range** of cells. For our purposes we only consider rectangular ranges, which consist of a rectangular collection of cells. Such ranges are specified as two cell addresses separated by a colon. E.g. the range C5:C10 consists of the 5 cells from cell C5 to C10, which for Figure 1 corresponds to the data elements for Brand B. The range A4:D10 consists of all the cells in the rectangle whose opposite corners are A4 and D10, which for Figure 2.1 corresponds to all the data in the table, including row and column headings, but excluding totals.

A **cell reference** consists of an address of a single cell (e.g. G17 or AB8) or of a cell range (e.g. A1:D6 or ZZ1:AAB14). Cell references can also be named. E.g. you can highlight the range B5:D5 in the worksheet in Figure 1, right click and select **Name a Range …** to assign the name London to the range B5:D5.

The usual way of selecting a cell is to simply move the mouse pointer to that cell and left click. To select a range of cells, click on a cell in one of the four corners of that range and then highlight the remaining cells in the range using the mouse. If the cells you want to select are not visible you can use the horizontal and vertical scroll bars in the usual manner to make the desired cell range visible.

If the contents of a cell are too big for the space allocated to that cell you may see the contents displayed as #######. To properly see the real contents you may need to increase the width of the column containing this cell. E.g. suppose the cell B5 is not wide enough for its contents. You can increase the width of column B by moving the mouse pointer to the vertical line at the border between the header for column B and column C (the column to the right of B). Once it is in the correct position the mouse pointer changes shape; you now hold down the left mouse button and move the mouse pointer to the right to increase the column width (or left to decrease the column width). You can also change the width of a column by right clicking on the column heading and selecting the **Column Width…** option.

**Formulas**

As mentioned above, besides numbers and text, cells can contain **formulas**. Formulas are built up from the following components preceded by an equal symbol (=):

- Numbers
- Text
- Operators
- Cell references
- Worksheet functions

Operators include the following:

- Arithmetic operators: addition (+), subtraction (-), multiplication (*), division (/) and exponentiation (^)
- The concatenation operator: & (used to concatenate text)
- Logical comparison operators: less than (<), greater than (>), less than or equal (<=), greater than or equal (>=), equal (=) and not equal (<>)

The usual **precedence rules** apply, namely multiplication and division are applied before addition and subtraction, and exponentiation is applied before any of the other operators. Parentheses are used to change the order that operators are applied. For example:

4+5*2 = 14 (4+5)*2 = 18

3^2-1 = 8 3^(2-1) = 3

-(3^2)+1 = -8 (-3)^2+1 = 10

For some reason, Excel violates the usual precedence rules by giving a unary minus precedence over exponentiation. E.g. Excel calculates -3^2+1 as if it were (-3)^2+1 and so returns a value of 10 instead of -8. Similarly -1^2 is evaluated as 1 instead of -1. To get the correct answer you need to use the expression -(1^2).

Excel provides a variety of worksheet functions such SUM, MIN, LOG, etc. We will describe these in more detail shortly.

Each cell has a **value**. For cells containing a number, text or a truth value, the value of the cell is simply the contents of the cell. For cells containing a formula, the value is the evaluation of the formula based on the values of any referenced cells. Some examples of the values of formulas are given in Figure 2.

**Figure 2 – Examples of Excel Formulas**

If we look at the example in Figure 1, we notice that the cell B11 contains the formula =SUM(B5:B10). The value of this formula, and therefore the value of cell B11, is 106.1. This is the value that is actually displayed in the cell. The formula is displayed in the Formula Bar, just above the grid, to the right of the symbol ** f_{x}**.

Note too that if we had assigned the name BrandA to the range B5:B10, then the formula =SUM(BrandA) would be equivalent to =SUM(B5:B10) and would have the same value.

In specifying a formula such as =A1+3 you can simply type the formula character by character. Alternatively you can type the character = and click on the A1 cell and then continue by typing +3. Excel will automatically specify the correct formula. Similarly for a formula which contains a cell range such as =SUM(A1:B4) you can type =SUM( and highlight the cell range A1:B4 and then type the right parenthesis to complete the formula.

If a cell has an illegal value you will see an error value displayed in the cell. E.g., if you enter =A1/0 into cell B1 then cell B1 will have the value #DIV/0 indicating that you are attempting to divide by zero. These error values all start with the symbol # and include #DIV/0, #N/A, #VALUE, #NAME?, etc.

One of the things that gives Excel such power is that when you change the value of any cell then the value of all formulas that reference that cell also change. E.g., if you change the value of cell B5 in the worksheet in Figure 1 from 23.5 to 13.5 then the value of cell B11 will automatically change from 106.1 to 96.1.

You can copy the contents of any cell into another cell (or, in fact, any cell range into another cell range, usually of the same size and shape). If the content of the first cell is a number, text or truth value, then the second cell will contain the same number, text or truth value. When the contents of the first cell is a formula then the second cell will also contain a formula, but the specific formula depends on the type of addressing that is used.

**Relative and absolute addressing**

Excel provides two types of addressing when defining a cell or cell range: absolute and relative addressing. The default is relative addressing. For example, as mentioned above, the contents of cell B11 in the worksheet in Figure 1 is =SUM(B5:B10). Here B5:B10 is a relative address. If this formula is copied into cell D11, then D11 will contain the formula =SUM(D5:D10), which accomplishes the same function as the formula in cell B11, namely to sum the values in the 5 cells above, and so the value of cell D11 will be 43.3. The usual way of copying the contents of one cell into another is to click on the first cell (B11 in the example above) and press **Ctrl-C**, and then click on the second cell (D11 in the example above) and press **Ctrl-V**.

The dollar symbol $ is used to specify absolute addressing. When copying a formula which contains a dollar sign to another cell location, any part of an address which contains a $ does not change. E.g., suppose cell B11 in the worksheet from Figure 1 contains =SUM($B$5:$B$10). Its value is still the same, namely 106.1, but this time if we copy the contains of cell B11 into D11, then D11 will also contain the formula =SUM($B$5:$B$10), and so the value of cell D11 will be 106.1 and not 43.3.

Note that we can use absolute addressing on any part of a cell address, namely, B11 (no absolute addressing), B$11 (absolute addressing for the row, but relative addressing for the column), $B11 (absolute addressing for the column, but relative addressing for the row) and $B$11 (absolute addressing for both row and column).

To change from relative addressing (the default) to absolute addressing, you simply put the $ in the appropriate place. Alternatively you can highlight the cell address in the Formula Bar or in the cell and press the function key **F4**. E.g. if you highlight the cell reference A3 and press the F4 key, then A3 changes to $A$3. If you press F4 again it changes to A$3. Pressing F4 again changes the reference to $A3, and finally pressing F4 one more time returns the cell address to the original reference of A3.

You can also reference cells in another worksheet. To do this you need to precede the reference (absolute or relative) by the name of the worksheet followed by !. The default names for worksheets in Excel are Sheet1, Sheet2, etc., although these names can be changed as explained below. Thus the (relative) reference to range A3:B4 in Sheet1 is Sheet1!A3:B4. You don’t really need to worry about all of this since by clicking on a cell in another worksheet, Excel automatically copies the correct reference address into the formula.

I found the section on precedence of exponentiation vs negative numbers confusingly worded. Perhaps replace “Another example of the precedence rules is” with “In conventional mathematical notation, the precedence rules are” or similar.

Excel works as it does because the “unary minus” operator (a minus used in front of a string of digits and (possibly) a decimal point and/or exponent to denote a negative number) has higher precedence (in Excel) than exponentiation. This is the same in FORTRAN (an early programming language for numerical applications) — I expect that Excel followed that language’s precedent.

Konrad, Thanks for this explanation. I now understand the situation better.

Charles

Hi Zaiontz,

I find the all the slides are very useful. But I face difficulties to use Logistic Regression by using Excel. When I add add-ins in excel the message box showing ‘Cann’t find project or library’ than it is asking RealStats Password. Could you please advise me that how can solve this problem.

I am looking forward to hear from you soon.

Kind regards,

Leo V. Dewri

Senior Lecturer

East West University

Leo,

Usually when someone gets an error message asking for the password, it means that they are trying to open the file they downloaded from the website. You should never try to open the file, nor do you need to to. You need to follow the directions on the webpage from where you performed the download and follow the installation instructions (starting with Alt-TI). The installation should be quick and easy.

Charles

Mr. Zaiontz,

After altering my data and playing around with your package I have discovered my error. It turns out to be a combination of not fully understanding your MLogit functions and having data too sparse. I collapsed my data into a smaller number of categories. Then I realized that I had misunderstood the meaning of the “h” argument to the MLogitParam function by starting the count at zero rather than one.

Though I haven’t put it to real use yet, your package appears very well thought out.

Regards,

Kevin Kilty

Kevin,

I am very pleased that you were able to resolve the problem. I am also happy that you like the package.

Charles

Mr. Zaiontz,

Thanks, first, for making this add-in for excel available. I have been playing with it, using it with some of your example data, which I calculate successfully. Last night I tried it on some data of my own that is interesting–to me at least. There are two independent variables, and a dependent variable that can take on 6 values. Thus I have been using your MLogit group of functions. My data are very sparse. Though I have 60 observations, I have not one replication of any treatment. When I attempt to retrieve coefficients using MLogitParams() I obtain #VALUE in every cell. Is this related to the sparseness of my data, or does it indicate some other mistake on my part?

Thanks for your reply, and thanks again for producing this very nice package.

Kevin Kilty