ITSE 1330 | C#

ACC

Variables & Types

Variables

No matter the type of job, we almost always need some form of storage. Chefs need their walk-ins and knife drawers. Mechanics need their toolboxes. Accountants need filing cabinets (physical and electronic). Software developers also need storage. Our running code needs storage. A variable is a location in memory that holds data of a specified type and that data can change. We will see below the different ways that value and reference variables are stored.

Variables are stored in the RAM (random access memory) of the computer. Variable values, the data stored at a location in memory, can change while the program is running. When the program exits, any variable data for that program still in RAM is lost, and the memory is released back to the operating system for other uses.

How do we refer to or access the contents of a variable? Well, we could use the hexadecimal address of the variable's location in RAM, something like: 0x42E8. However, that would be onerous, time-consuming, and error prone. The solution: give variables more humanly recognizable names. For instance, in the last chapter, the Aircraft class used the words speed and altitude to identify that data. The words speed and altitude themselves are known as identifiers. They provide a convenient means to identify those variables. Internally, to the CLR, altitude might be equivalent to the memory address: 0x39D4. In other words, that is the location in memory where the value for altitude is stored.

Identifier Naming Rules

C# has naming rules that are enforced by the compiler and must be followed. Keywords such as int, double, new, enum, and static can only be used as keywords and cannot be identifiers. There is an exception to this rule. When interfacing with other programming languages, it can be useful to use keywords as identifiers. C# supports this behavior (known as a "verbatim identifier") by allowing an @ to prefix the keyword, thus making it an identifier. For example, double is a keyword discussed below. It can be converted into an identifier by prepending the @ (@double). This technique is not used in this introductory course. The official keyword list can be viewed here.

As mentioned above, the names of variables are known as identifiers. The names of classes, objects, structures, and properties are also known as identifiers. C# limits identifiers to letters, numbers, and the underscore character. Furthermore, identifiers must begin with a letter or an underscore and cannot begin with a number. To summarize, the identifier naming rules are:

  • Identifiers are case sensitive (i.e. num1 is not the same as Num1)
  • Cannot be a keyword (unless using verbatim identifier)
  • Can only contain letters, numbers, and underscores
  • Cannot begin with a number
  • Cannot contain special characters (e.g. *#%@!)
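The rules above can be tried out in a short sketch. The variable names here are hypothetical and chosen only for illustration; the commented-out lines are ones the compiler would reject.

```csharp
using System;

class IdentifierRules
{
    static void Main()
    {
        int num1 = 1;       // valid: begins with a letter
        int Num1 = 2;       // a different variable: identifiers are case sensitive
        int _count = 3;     // valid: begins with an underscore
        int @double = 4;    // verbatim identifier: the keyword double used as a name

        // int 2fast = 5;   // error: cannot begin with a number
        // int tax% = 6;    // error: special characters are not allowed
        // int double = 7;  // error: double is a keyword

        Console.WriteLine(num1 + Num1 + _count + @double);  // prints 10
    }
}
```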

Identifier Naming Conventions

Conventions, unlike rules, are suggestions or recommendations and are not enforced by the compiler. However, for readability, portability, and team programming, conventions are usually followed by software development teams. There are two general categories for naming conventions: variables and all others.

Microsoft recommends that C# variables use lower camel casing which is also known simply as camel casing. With lower camel casing the first letter of the first word of the identifier is lower case and the first letter of each remaining word is upper case. For instance, cityTaxRate is an example variable identifier that uses lower camel casing. In a program used by a state organization, it is easy to see that the cityTaxRate would vary by city.

For most other identifiers such as namespaces, methods, classes, structures, constants, enums, and properties, Microsoft recommends using upper camel casing (a.k.a. Pascal Casing). With upper camel casing, the first letter of each word is upper case. For instance, StateTaxRate is an example constant identifier that uses upper camel casing. In a program used by a state organization, the tax rate for the state would likely be constant. Note that in many other languages such as Java, C, and C++, constants are all upper case. However, in C# the use of upper camel case for constants is the convention. The use of all upper case for an identifier is considered SCREAMING_CAPS and is discouraged.
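As a minimal sketch of the two casing conventions side by side, consider the following. The class name, method name, and tax rates are assumptions made up for this illustration.

```csharp
using System;

class TaxCalculator                          // class: upper camel (Pascal) casing
{
    const decimal StateTaxRate = 0.0625M;    // constant: upper camel casing

    // Method name: upper camel casing. Parameters: lower camel casing.
    static decimal ComputeTax(decimal amount, decimal cityTaxRate)
    {
        return amount * (StateTaxRate + cityTaxRate);
    }

    static void Main()
    {
        decimal cityTaxRate = 0.02M;         // local variable: lower camel casing
        Console.WriteLine(ComputeTax(100M, cityTaxRate));
    }
}
```

None of these choices affect whether the program compiles; they only make the role of each identifier easier to recognize.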

The general .NET naming conventions can be viewed here and the C# naming conventions can be viewed here.

Value Types

In our programs, we work with different types of data. Recall that C# is a strongly-typed language which requires that the type of storage location (variable) must agree with the type of the data stored there. In other words, in a strongly-typed language, we receive compiler errors if we attempt to store a string (text data) in an integer variable. With that in mind, let's look at the table below which lists the built-in data types supported by C#.

The ranges are approximate for some of the entries. For instance, 32K is actually 32,767. The "u" prepended to some types means "unsigned", which doubles the positive capacity of the corresponding signed type. How is the size in bytes relevant? Well, a byte is 8 bits (eight 1's and 0's). How many distinct values can be represented with two bytes? That is 16 bits, so 2^16 = 65,536, which is 64K. By the way, a K is 1024 (2^10). Since a short is two bytes, a total of 64K values can be represented with a short. If it is signed, the range is -32,768 to 32,767 and if it is unsigned (ushort) the range is 0 to 65,535. Since an int occupies 4 bytes, that is 8 x 4 = 32 bits, so 2^32 values, or about 4 billion (4B). The two non-numeric data types in the table are char, which is used to store one character, and bool (short for boolean), which can be either true or false. As in C and C++, when a number is converted to a boolean, 0 becomes false and every other number becomes true. However, unlike in C and C++, the actual value stored in a C# boolean is true or false and NOT 1 or 0. There is a code example of this point in the Conversion project below.

Type       Size in Bytes   Range
byte       1               0 to 255
sbyte      1               -128 to 127
short      2               -32K to 32K
ushort     2               0 to 65K
int        4               -2B to 2B
uint       4               0 to 4B
long       8               -9 x 10^18 to 9 x 10^18
ulong      8               0 to 18 x 10^18
float      4               -3.40 x 10^38 to 3.40 x 10^38
double     8               -1.79 x 10^308 to 1.79 x 10^308
decimal    16              -7.9 x 10^28 to 7.9 x 10^28
char       2               character
bool       1               true or false
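The sizes and exact limits in the table can be checked from code. The sketch below uses the sizeof operator and the MinValue and MaxValue constants that the numeric types provide; it covers only a few of the types.

```csharp
using System;

class TypeSizes
{
    static void Main()
    {
        Console.WriteLine(sizeof(short));     // 2 (bytes)
        Console.WriteLine(short.MinValue);    // -32768
        Console.WriteLine(short.MaxValue);    // 32767
        Console.WriteLine(ushort.MaxValue);   // 65535

        Console.WriteLine(sizeof(int));       // 4 (bytes)
        Console.WriteLine(int.MaxValue);      // 2147483647
        Console.WriteLine(uint.MaxValue);     // 4294967295

        Console.WriteLine(sizeof(decimal));   // 16
        Console.WriteLine(sizeof(char));      // 2
        Console.WriteLine(sizeof(bool));      // 1

        // Number of distinct values for 16 bits and 32 bits
        Console.WriteLine(1L << 16);          // 65536
        Console.WriteLine(1L << 32);          // 4294967296
    }
}
```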

Value types, unlike the reference and pointer types discussed below, contain the actual "values" of their variables. That is, if a decimal type of variable with the name price has been assigned the value of 1.99, then price contains 1.99. That is in contrast to reference and pointer types which do not contain "data" values but rather addresses.

When a number appears directly in code (e.g. 123) it is known as a numeric literal since it is "literally" supplied. The same concept applies to characters and strings. That is, 'a' is a character literal and "abc" is a string literal. Note that in C#, single quotes are used for single characters and double quotes are used for strings (a sequence of zero or more characters). In some languages, like JavaScript, single and double quotes can be used interchangeably (as long as they are paired).
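A brief sketch of the three kinds of literal follows; the variable names are hypothetical.

```csharp
using System;

class Literals
{
    static void Main()
    {
        int count = 123;          // numeric literal
        char grade = 'a';         // character literal: single quotes
        string text = "abc";      // string literal: double quotes
        string oneLetter = "a";   // still a string, even with one character

        // char bad = "a";        // error: a string cannot be assigned to a char

        Console.WriteLine(count);             // prints 123
        Console.WriteLine(grade);             // prints a
        Console.WriteLine(text);              // prints abc
        Console.WriteLine(oneLetter.Length);  // prints 1
    }
}
```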

C# treats floating point numeric literals as doubles. On line 25 in the code snippet below, we are trying to assign the numeric literal (a double) to a decimal type, which the compiler does not allow since decimal is a base-10 type and double is base-2. On line 26, double quotes are used for a character type. We fix both errors in the next project.
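The two errors just described can be reproduced in a few lines. This is only a sketch with hypothetical variable names and values, not the course listing, so the line numbers cited in the text do not apply to it. The failing lines are commented out and each is followed by a form that compiles.

```csharp
using System;

class LiteralSuffixes
{
    static void Main()
    {
        // decimal price = 1.99;   // error: 1.99 is a double literal
        // char letter = "A";      // error: double quotes make a string

        decimal price = 1.99M;     // M suffix: the literal is a decimal
        float ratio = 2.5F;        // F suffix: the literal is a float
        double rate = 3.75;        // no suffix: the literal is a double
        char letter = 'A';         // single quotes: the literal is a char

        Console.WriteLine(price.GetType());   // System.Decimal
        Console.WriteLine(ratio.GetType());   // System.Single
        Console.WriteLine(rate.GetType());    // System.Double
        Console.WriteLine(letter.GetType());  // System.Char
    }
}
```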

Start a new C# console project in VS with the code below. Name the project ValueVariables + "initials". This project demonstrates use of the C# variables and data types. Let's observe a few points in the code. Note the "F" and "M" appended to the literals on lines 22 and 25. The "F" and "M" suffixes tell the compiler to treat the literals, which would otherwise be of type double, as types float and decimal respectively. Notice also on line 24 that the double quotes were replaced with single quotes. Notice that lines 13-27 contain examples of the 13 value data types in the table above. When a primitive (a.k.a. simple) variable is created it is both declared and defined. Declaration involves assigning a name to the variable and the name is known as an identifier. Definition is the process of allocating memory. For example, on line 14 one byte of memory is allocated to a variable with the name mysByte. In addition to declaring and defining, each of the examples on lines 13-27 also includes the initialization step, which assigns a value to the variable at the time of variable creation.

ValueVariables Program.cs

Start the application to see the output shown. Lines 31-43 above display the contents of the variables and their sizes in bytes. The "\t" places a tab in the output. The "{0,12}" is a format item: the "0" selects the first argument after the format string (the list is zero-based) and the "12" right-aligns its value in a field 12 characters wide. For example, on line 31, the first argument in the output list is myByte and it will be printed in a field of width 12. The "{1,10}" prints the second argument in a field of width 10.
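The alignment behavior can be seen in isolation in the sketch below. It assumes myByte holds 255 and myLong holds 888; the last line adds one detail not used in the project, namely that a negative width left-aligns the value.

```csharp
using System;

class AlignmentDemo
{
    static void Main()
    {
        byte myByte = 255;
        long myLong = 888;

        // {0,12}: first argument, right-aligned in a field 12 characters wide
        // {1,10}: second argument, right-aligned in a field 10 characters wide
        Console.WriteLine("byte\t{0,12}{1,10}", myByte, sizeof(byte));
        Console.WriteLine("long\t{0,12}{1,10}", myLong, sizeof(long));

        // A negative width left-aligns the value instead
        Console.WriteLine("[{0,-6}]", myByte);   // prints [255   ]
    }
}
```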

Output ValueVariables

The code above demonstrates value variables which are designed to store the actual value in the memory location of the variable. The variable myLong contains the value 888 and so on. Code the project and practice with the various settings.

Reference Types

Reference types are so named because they contain references (addresses) to variable data instead of the values themselves. This section covers three built-in reference types in C#: object, string, and dynamic. Classes, interfaces, arrays, and delegates are reference types as well.

The object type is a root-level type and is the same as the System.Object class. Variables of type object are generalists and can be assigned values from the following types: value, reference, and user-defined. In the code below, notice that myObject is being assigned a variety of types. The process is known as "boxing" when a value type is assigned to an object type. When an object type is converted back to a value type the process is known as "unboxing".

Boxing and unboxing are demonstrated in the example. On line 14 a variable named myObject of type object is created and initialized with the value of 500. The integer 500 is "boxed" into the object. Recall that an object type can be assigned a variety of types. On line 16, the value in myObject is needed for a multiplication operation, myObject * 2 (the * is the multiplication operator). Since general objects do not support multiplication, myObject must be "cast" (converted) into an integer. The casting/conversion operation on line 16 is known as "unboxing".
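A minimal sketch of the same boxing and unboxing steps follows. It is not the course listing, so its line numbers do not match those cited above.

```csharp
using System;

class BoxingDemo
{
    static void Main()
    {
        object myObject = 500;             // boxing: the int 500 is wrapped in an object

        // int bad = myObject * 2;         // error: * is not defined for type object
        int result = (int)myObject * 2;    // unboxing with the (int) cast, then multiply
        Console.WriteLine(result);         // prints 1000

        myObject = "now a string";         // the same variable can hold another type
        Console.WriteLine(myObject.GetType());   // prints System.String
    }
}
```

The cast must name the type that was actually boxed; unboxing the 500 above as a different type such as (long) would fail at run time.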

Since strings represent text information, they are pervasive in almost all programming languages. In C#, strings are immutable. Immutable is a computer science term which means cannot be changed. However, variables, by definition, can be modified. The two ideas fit together as follows: instead of modifying the existing string, C# creates an entirely new string object and makes the variable refer to it. This provides the effect and illusion of changing the string value.
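Immutability is easiest to see when two variables refer to the same string. In this sketch (hypothetical names), "changing" one variable leaves the other untouched because a new string object is created.

```csharp
using System;

class StringImmutability
{
    static void Main()
    {
        string original = "Hello";
        string alias = original;        // both variables refer to the same string object

        original += ", world";          // builds a NEW string; "Hello" itself is untouched

        Console.WriteLine(original);    // prints Hello, world
        Console.WriteLine(alias);       // prints Hello
        Console.WriteLine(object.ReferenceEquals(original, alias));  // prints False

        string upper = alias.ToUpper(); // string methods return new strings as well
        Console.WriteLine(alias);       // prints Hello
        Console.WriteLine(upper);       // prints HELLO
    }
}
```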

Most languages provide extensive string manipulation methods and C# is no exception. In C# there are string class methods to concatenate, modify contents, compare, split, and search. See the MSDN String Programming Guide for extensive coverage of C# string operations.
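A few of these operations are sketched below using methods of the string class. The sample text is arbitrary.

```csharp
using System;

class StringMethods
{
    static void Main()
    {
        string city = "Austin";
        string state = "Texas";

        string joined = string.Concat(city, ", ", state);    // concatenate
        Console.WriteLine(joined);                           // prints Austin, Texas

        Console.WriteLine(joined.Contains("Texas"));         // search: prints True
        Console.WriteLine(joined.IndexOf("Texas"));          // search: prints 8
        Console.WriteLine(joined.Replace("Texas", "TX"));    // prints Austin, TX

        // Compare, ignoring upper and lower case
        Console.WriteLine(string.Equals(city, "austin",
            StringComparison.OrdinalIgnoreCase));            // prints True

        string[] parts = joined.Split(',');                  // split on the comma
        Console.WriteLine(parts.Length);                     // prints 2
        Console.WriteLine(parts[1].Trim());                  // prints Texas
    }
}
```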

The third and last reference type covered here is dynamic. Objects of this type are very similar to the base object type since, like object, they are generalists and the same dynamic variable can be assigned values of various types. The difference between type object and type dynamic is that operations on dynamic variables bypass static (compile-time) type checking. This means that type checking for object types is performed by the C# compiler, while for dynamic types it is deferred until run time. We will not explore more details with dynamic variables since the topic is beyond the scope of an introductory course.
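The difference in type checking shows up as soon as an operation is applied to the variable. The sketch below assumes a standard console project, where the run-time support that dynamic needs is referenced by default.

```csharp
using System;

class DynamicDemo
{
    static void Main()
    {
        object myObject = 10;
        dynamic myDynamic = 10;

        // int a = myObject + 5;     // compile-time error: + is not defined for object
        int b = myDynamic + 5;       // compiles; the + is resolved at run time
        Console.WriteLine(myObject); // prints 10
        Console.WriteLine(b);        // prints 15

        myDynamic = "text";          // the same variable now holds a string
        Console.WriteLine(myDynamic.Length);     // prints 4
        Console.WriteLine(myDynamic.GetType());  // prints System.String
    }
}
```

The trade-off is that a mistake such as calling a member that does not exist on a dynamic variable is reported only when the program runs, not when it is compiled.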

Start a new project named ReferenceVariables. From this point on in the course the + "initials" guidance is presumed. Furthermore, the initials are only included as a reminder that those projects, classes, and executables are yours. You can drop the initials at any time if you are comfortable with differentiating between the code you are writing and other accessed or downloaded projects.

Code the project as shown. By the way, you are encouraged to extend the code examples in any way desired. The code provided is just to get us started. On lines 14, 17, and 18 objects of the three reference types are created: object, string, and dynamic. On line 20 the "\n" is used to output a new line on the console. Lines 22-24 output the original contents of the object, string, and dynamic variables. In the next two sections from lines 28-50, the values and types contained in myObject and myDynamic are changed and output to the console.

ReferenceVariables Program.cs

The output of ReferenceVariables is shown. Notice the changing values and types of both myObject and myDynamic.

Output ReferenceVariables

Pointer Types

Pointers are variables that store the address of another variable. Pointers in C# behave much like those in the C and C++ languages. Recall that C/C++ are unmanaged languages, which means that memory management is the programmer's responsibility. Programmers must ensure that memory is freed (deleted) when no longer used. In C#, this role is normally performed by the garbage collector. Pointers can only be used in code that is explicitly marked with the unsafe keyword, which is not very common. Within that code the usual safety checks are bypassed and the programmer takes on responsibility for any memory that the garbage collector does not manage. Since developing "unsafe" code is not common in C#, the topic of pointers will not be reviewed in this course.

Type Casting

The next topic we will cover in this chapter is type casting which means converting from one type to another. We have already seen type casting in the unboxing example above in which an object was cast to an integer so multiplication could be performed. During a cast, a temporary copy of the value is created and the original remains unchanged. There are two general forms of casting, implicit and explicit.

With implicit casting, the compiler will attempt to convert from one type to another. In the code segment, all five implicit casts from lines 44-48 are successful since a variable of a smaller type is being assigned to a variable of a larger type that can hold every value of the smaller one.
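Five implicit casts of this kind might look like the sketch below, which assumes the variable names used in the Casting project and a starting value of 255 in myByte.

```csharp
using System;

class ImplicitCasts
{
    static void Main()
    {
        byte myByte = 255;

        // Each assignment moves a 1-byte value into a larger type; no cast operator is needed
        short myShort = myByte;
        int myInt = myByte;
        long myLong = myByte;
        float myFloat = myByte;
        double myDouble = myByte;

        Console.WriteLine("{0} {1} {2} {3} {4}",
            myShort, myInt, myLong, myFloat, myDouble);   // prints 255 255 255 255 255
    }
}
```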

However, the C# compiler protects against "narrowing" conversions. That is, the compiler produces an error when attempting to convert from a larger data type to a smaller. See the code segment. The red wavy line under the variables means that the implicit conversion cannot be accomplished from a large to a smaller data type. Notice the error message highlighted that appears when the cursor hovers over myDouble. Also note the message that an explicit cast exists which we will get to shortly. See this table on the MSDN (Microsoft Developer Network) site for a full listing of implicit cast operations.

Explicit casting means that the programmer is instructing the compiler to perform the conversion even if moving from larger to smaller. In the code segment, all five explicit conversions are allowed since the narrowing protection of the compiler is being overridden by the programmer with the explicit cast which is the (byte) that precedes the variable. The (byte) is known as the cast operator.

Explicit cast operators exist for almost all value types with the notable exception of bool. Implicit conversion with bool is also not supported. It should be noted that data loss is possible with explicit conversions. Moving from a larger type to a smaller one can naturally result in some information being lost. By default, an explicit numeric cast whose value exceeds the range of the receiving variable overflows silently. An OverflowException (error) is thrown (produced) only when the cast is performed in a checked context or when the conversion is done with a System.Convert method. More on exceptions in an upcoming chapter. See this table on the MSDN (Microsoft Developer Network) site for a full listing of explicit cast operators.
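The sketch below contrasts a plain explicit cast with the same cast inside the checked operator, which is one way to ask C# to report overflow. The value 9.99 is a hypothetical choice; 333 matches the myInt value used in the Casting project.

```csharp
using System;

class ExplicitCasts
{
    static void Main()
    {
        int myInt = 333;
        double myDouble = 9.99;

        // byte b = myInt;                 // error: no implicit conversion from int to byte

        byte fromInt = (byte)myInt;        // allowed, but the value no longer fits
        byte fromDouble = (byte)myDouble;  // allowed, but the fraction is dropped
        Console.WriteLine(fromInt);        // prints 77
        Console.WriteLine(fromDouble);     // prints 9

        try
        {
            byte guarded = checked((byte)myInt);   // checked: overflow is reported
            Console.WriteLine(guarded);
        }
        catch (OverflowException)
        {
            Console.WriteLine("OverflowException");   // this line is printed
        }
    }
}
```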

Start a new project named Casting and code as shown. There are two sections of interest. First, lines 21-32 demonstrate implicit "upcasting", that is, moving from a small memory location to a larger. Second, lines 42-53 demonstrate explicit casting in which we the programmers are forcing the C# compiler to move from a larger memory location to a smaller. Let's look at the output window.

Casting Program.cs

We can observe the effects of the Casting code in the output window. Notice in the implicit upcasting results that the original value in myByte of 255 is retained in its new locations each of which has sufficient memory to accommodate a byte data type.

However, the results are not so good with the explicit casts. The value in myShort of 32767 was changed to be stored in myByte as 255. The value in myInt of 333 was stored in myByte as 77. Both the myShort and myInt changes are the result of overflow. No OverflowException is thrown here because, by default, an explicit cast simply discards the bits that do not fit. For example, since 256 is the number of values that one byte can support (from all 0's to all 1's): 333 - 256 = 77. The 333 value "overflowed" the maximum byte value of 255 to yield the result of 77.

The same overflow effect occurred with myLong: 888 - (3 x 256) = 120. The 888 overflowed three times to yield a remainder of 120. myShort overflowed 127 times: 32767 - (127 x 256) = 255. Instead of saying "it overflowed that many times", it is more accurate to say that only the lowest 8 bits of the value are kept, which is the remainder after dividing by 256. Notice that the values after the decimal point were lost (truncated) for both myFloat and myDouble.
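The arithmetic above can be confirmed with the remainder (%) and integer division (/) operators. This sketch uses the same three values as the Casting project.

```csharp
using System;

class WrapAround
{
    static void Main()
    {
        short myShort = 32767;
        int myInt = 333;
        long myLong = 888;

        // The cast to byte and the remainder after dividing by 256 agree
        Console.WriteLine((byte)myShort);   // prints 255
        Console.WriteLine(myShort % 256);   // prints 255
        Console.WriteLine((byte)myInt);     // prints 77
        Console.WriteLine(myInt % 256);     // prints 77
        Console.WriteLine((byte)myLong);    // prints 120
        Console.WriteLine(myLong % 256);    // prints 120

        // How many whole multiples of 256 were discarded
        Console.WriteLine(myShort / 256);   // prints 127
        Console.WriteLine(myLong / 256);    // prints 3
    }
}
```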

Output Casting

Programmers should be very careful with explicit casting given the results we observed. It can be useful under certain circumstances but overflow and truncation must be considered.

Type Conversion Methods

The System.Convert class (which inherits directly from System.Object) contains more than one hundred methods which can be reviewed here. A few of the Convert class methods are demonstrated in the Conversion project. The ToString() method is so common that it is defined on System.Object and is therefore available on every type, including all value types. Lines 23, 25, and 27 demonstrate the ToString() method, which converts the numerical values in the variables to strings. Notice that ToString() is called as a method of the variable (e.g. myByte.ToString()). In the output, the numbers appear as numbers but they are actually strings due to the conversion.

Lines 39 and 41 show examples of the Parse() method which takes a string for an argument ("12345" on line 39) and returns a number representation of that string. Note that Parse() is called from the type (in this case Int32) and not the variable. You can also see that the keyword "int" on line 42 also has the TryParse() method. That is because "int" is an alias or shorthand for the more formal "Int32". They point to the same type.
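A short sketch of Parse(), TryParse(), and the int/Int32 alias follows. The input strings are arbitrary, and the listing is not the Conversion project itself.

```csharp
using System;

class ParseDemo
{
    static void Main()
    {
        int a = Int32.Parse("12345");   // called on the type, not on a variable
        int b = int.Parse("12345");     // int is an alias for Int32, so this is the same call
        Console.WriteLine(a + b);       // prints 24690

        Console.WriteLine(typeof(int) == typeof(Int32));   // prints True

        // TryParse reports failure with a bool instead of throwing an exception
        int parsed;
        bool ok = int.TryParse("12x45", out parsed);
        Console.WriteLine(ok);          // prints False
        Console.WriteLine(parsed);      // prints 0

        string text = a.ToString();     // back to a string
        Console.WriteLine(text.Length); // prints 5
    }
}
```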

On lines 48, 50, 52, and 54 the ToBoolean() conversion method of the Convert class is called. Like the Console class, the Convert class is static, which means that an object of type Convert is not required in order to use methods of the class. Various overloaded forms of the ToBoolean() method can be observed here. In the examples below, Int32 types are being passed. Unlike other languages such as C and C++, in C# (and in Java) the boolean values true and false do not have inherent numeric values. However, lines 48 and 54 produce false since the argument evaluates to zero. Lines 50 and 52 produce true since the arguments are non-zero. Again, in C# false does not equal zero. But zero does convert to false when passed to Convert.ToBoolean(). And in C# true does not equal one as in C and C++. But any non-zero integer does convert to true when passed to Convert.ToBoolean(). On lines 50 and 52, the arbitrary values 99 and -4368 were converted to true.
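The behavior described above can be sketched as follows. The values 99 and -4368 come from the text; the flag variable and the Convert.ToInt32() call are additions for illustration.

```csharp
using System;

class ToBooleanDemo
{
    static void Main()
    {
        Console.WriteLine(Convert.ToBoolean(0));       // prints False
        Console.WriteLine(Convert.ToBoolean(99));      // prints True
        Console.WriteLine(Convert.ToBoolean(-4368));   // prints True

        bool flag = true;
        // int n = (int)flag;    // error: no cast exists between bool and int
        // if (1) { }            // error: an int cannot be used as a condition

        Console.WriteLine(flag);                       // prints True, not 1
        Console.WriteLine(Convert.ToInt32(flag));      // prints 1
    }
}
```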

Conversion Program.cs

 

Output Conversion

What's Next?

In the next chapter, we consider the role of operators in C#.