Monday 1 December 2008

Review: pointers and references in C++

A Reference Guide
Last updated Jan 1, 2004. From: http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=166

Unlike most other programming languages, C++ offers an unusual diversity of argument passing mechanisms: pass by value, pass by address, and pass by reference. Certain compilers also support additional non-standard argument passing mechanisms, such as passing arguments in a register. Today I will explain the semantics of the three standard argument passing mechanisms, and focus on reference variables.

A bit of History
Many high-level programming languages give the programmer little or no choice of argument passing mechanism. In Java, for example, fundamental types such as int and boolean are passed by value, whereas objects are always manipulated through references, so a method can modify the caller's object but has no way to ask for a copy of it.

The crucial point, however, is that programmers cannot intervene in or control the actual argument passing mechanism that the language applies. C is different in this regard. Built on the foundations of BCPL and B, which were really nothing more than high-level assembly languages, it offered two argument passing mechanisms: pass by value and pass by address. Oddly enough, it didn't offer the more common notion of pass by reference.

Pass by Value
Pass by value is the default argument passing mechanism of both C and C++. When a caller passes an argument by value to a function (known as the callee), the latter gets a copy of the original argument. This copy remains in scope until the function returns and is destroyed immediately afterwards. Consequently, a function that takes value arguments cannot change the caller's variables, because any changes apply only to its local copies. For example:

#include <iostream>

void negate(int n) // buggy version
{
    n = -n; // affects only a local copy of n
}

void func()
{
    int m = 10;
    negate(m); // doesn't change the value of m
    std::cout << m << std::endl; // display: 10
}
When negate() is called, C++ creates a fresh copy of m on its stack. negate() then modifies this private copy, which is destroyed immediately when it returns. The original variable m remains intact, though. Consequently, the cout expression displays 10 rather than -10. If you want the callee to modify its arguments, you must override the default passing mechanism.

Pass by Address
In C, the only way to achieve this is by passing the argument's address to the callee. This passing mechanism is traditionally called pass by address.

In the literature, this is often called pass by reference too, although that's a misnomer: C++ uses the term to denote a radically different passing mechanism. For example,

void negate(int * pn)
{
    *pn = -*pn; // modifies the caller's variable through the pointer
}
Similarly, the caller must also be adjusted:

void func()
{
    int m = 10;
    negate(&m); // pass m's address
    std::cout << m << std::endl; // display: -10
}
Pass by Reference
This is all well and good, but the technique is tedious and error-prone. My impression is that most C/C++ function calls don't use pass by value, so forcing programmers to override the default passing mechanism with the &, * and -> operators is an example of a bad language design choice. The creators of C++ were aware of this and introduced a new type of argument passing, namely pass by reference. The addition of reference variables and arguments to C++ was a means of fixing a historical accident made in C about a decade earlier rather than a genuine innovation. However, the introduction of references did affect fundamental programming idioms in C++.

Most textbooks will tell you that a reference is "an alias for an existing object." For example,

int m=0;
int &ref=m;
The reference ref serves as an alias for the variable m. Thus, any change applied to m is reflected in ref and vice versa:

++m;
std::cout<<ref<<std::endl;// output 1
++ref; //increment m
std::cout<<m<<std::endl;// output 2
In other words, ref and m behave as distinct names of the same object. In fact, you may define any number of references to the same object:

int & ref2=ref;
int & ref3=m;
//..and so on
Here ref2 and ref3 are aliases of m, too. Notice that there's no such thing as a "reference to a reference;" the variable ref2 is an alias of m, not ref.
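
One way to check the aliasing claim is to compare addresses: applying & to a reference yields the address of the object it refers to. The following is only a minimal sketch, and the function name all_names_share_one_object is made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not part of the original article): returns true
// when m, ref, ref2 and ref3 all designate the same int object.
bool all_names_share_one_object()
{
    int m = 0;
    int &ref = m;
    int &ref2 = ref; // binds to m, not to "ref" itself
    int &ref3 = m;

    ++ref2;          // increments m through one of its aliases
    assert(m == 1);

    // Taking the address of any alias gives the address of m.
    return &ref == &m && &ref2 == &m && &ref3 == &m;
}
```

Calling all_names_share_one_object() from main() should return true, since there is only one int object and four names for it.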

Usage
In most cases, references are used as a means of passing arguments to a function by reference. The nice thing about references is that they function as pointers from a compiler's point of view, although syntactically they behave like ordinary variables. They enable a callee to alter its arguments without forcing programmers to use the unwieldy *, & and -> notation:

void negate(int & n)
{
    n = -n; // modifies caller's argument
}
void func()
{
    int m = 10;
    negate(m); // pass by reference
    std::cout << m << std::endl; // display: -10
}
Advantages of Pass by Reference
Passing by reference combines the benefits of passing by address and passing by value. It's as efficient as passing by address, because the callee doesn't get a copy of the original value but rather an alias thereof (under the hood, compilers typically implement reference arguments as ordinary pointers). In addition, it offers a more intuitive syntax and requires fewer keystrokes from the programmer. Finally, references are usually safer than ordinary pointers because a reference must be bound to an object when it is created; C++ doesn't have null references, so you don't need to check whether a reference argument is null before examining its value or assigning to it.
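
To make the safety point concrete, here is a small sketch that puts the two styles side by side. The names negate_ptr, negate_ref and demo_negate are assumptions chosen for this illustration; the original article calls both versions negate.

```cpp
#include <cassert>

// Pointer version: the callee has to defend against a null argument.
// Returns false when there was nothing to negate.
bool negate_ptr(int *pn)
{
    if (!pn)
        return false;
    *pn = -*pn;
    return true;
}

// Reference version: there is no null state to test for.
void negate_ref(int &n)
{
    n = -n;
}

// Hypothetical usage, exercising both versions.
void demo_negate()
{
    int m = 10;
    assert(negate_ptr(&m) && m == -10); // caller must write &m
    negate_ref(m);                      // caller passes m directly
    assert(m == 10);
    assert(!negate_ptr(0));             // a null pointer is rejected
}
```

The pointer version needs both the check inside the callee and the & at the call site; the reference version needs neither.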

Passing objects by reference is usually more efficient than passing them by value because no large chunks of memory are copied and no constructor and destructor calls are performed. However, this argument passing mechanism enables a function to modify its argument even if it's not supposed to. To avert this, pass all read-only parameters as references to const. This way, the callee will not be able to modify them:

void display(const Shape & s)
{
    s.draw(); // compiles only if draw() is a const member function
}
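
The efficiency claim can be observed directly by counting copy-constructor calls. The sketch below uses a made-up type, Tracked, and made-up function names; none of them come from the original article.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked
{
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked &other) : value(other.value) { ++copies; }
    int get() const { return value; } // const, so callable through a const reference
};
int Tracked::copies = 0;

int read_by_value(Tracked t) { return t.get(); }            // copies the argument
int read_by_const_ref(const Tracked &t) { return t.get(); } // no copy is made

// Returns the number of copies made by one call of each kind.
int copies_after_both_calls()
{
    Tracked::copies = 0;
    Tracked t(42);
    read_by_const_ref(t);          // the copy constructor is not called
    assert(Tracked::copies == 0);
    read_by_value(t);              // the copy constructor runs once
    return Tracked::copies;
}
```

Here copies_after_both_calls() returns 1: the single copy comes from the by-value call, while the call through a reference to const makes none.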
Summary
The introduction of references to C++ initially caused a bit of confusion, especially among ex-C programmers who weren't sure which passing mechanism to use when (and in those days, virtually all C++ programmers were former C programmers). Even illustrious gurus recommended the following dichotomy: when a function modifies its arguments, pass them by address, and when it doesn't, pass them as references to const.

Today, using bare pointers is a thing of the past. In most cases, C++ offers superior alternatives that are both safer and cleaner. Similarly, when you decide how to pass arguments to a function, use references by default (except when you truly need to pass them by value) and avoid passing by address unless there's a compelling reason to do so. If the function in question shouldn't modify its arguments, they should be declared const and passed by reference; the lack of the const qualifier indicates that the function is allowed to modify an argument. When should you pass arguments by address? Only when the function deals with real pointers: for instance, when it allocates raw storage or when a null pointer is a valid argument. Otherwise, use references.
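
As an illustration of the "null pointer is a valid option" case, consider a search function whose caller may or may not want the position of the match. This is a sketch under assumed names (contains, index_out, demo_contains); it is not taken from the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: 'values' is read-only, so it is a reference to const.
// 'index_out' is optional, so it is a pointer that may legitimately be null.
bool contains(const std::vector<int> &values, int wanted, std::size_t *index_out)
{
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == wanted) {
            if (index_out)       // report the position only if asked to
                *index_out = i;
            return true;
        }
    }
    return false;
}

// Hypothetical usage showing both ways of calling contains().
void demo_contains()
{
    std::vector<int> v;
    v.push_back(5);
    v.push_back(7);

    std::size_t idx = 0;
    assert(contains(v, 7, &idx) && idx == 1); // caller wants the index
    assert(contains(v, 5, 0));                // caller passes null: index not needed
}
```

A reference could not express "no output wanted", which is why a pointer is a reasonable choice for this one parameter.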
