12: Pointers
UC Irvine - Fall ‘22 - ICS 45C
Quick list of things I want to talk about:
- Why?!?!?!?!
- How to use them?
- nullptr
- ptr++
- pass-by-pointer
- struct-like access
Expanded notes:
Pointers get a bad rap everywhere, so if you haven’t used them you might be somewhat scared about this lecture… but there is nothing to worry about! Pointers are just a slightly different type of variable.
Variables in memory
Before we actually talk about pointers, let’s recap variables. We talked about how different types have different sizes, and how they can store different ranges.
So when you declare a variable with int my_var = 2; you can imagine that the computer has some memory reserved for my_var:
In this example, we’re assuming that ints take 4 bytes, and that the memory address the computer used for my_var is 0x023.
Note how my_var is stored at 0x023, but the next available address is 0x027, which is 4 bytes later [1].
So this figure simply shows what happens if you create an int variable and store a value in there.
We have a label called my_var that we and the compiler know about, an address 0x023 that the computer knows about, and the value can be accessed by both the name and the address.
Arrays in memory
Arrays are similarly stored in memory.
We talked a little bit about this in lecture, but let’s revisit!
Assuming we have an array of 3 ints, this could be a possible visualization:
Our array, my_arr, starts at the same position as before just for the sake of these examples, 0x023.
Then, at that same address we have the value of the 1st element (index 0), 4 bytes later we have the 2nd element (index 1), then 4 more bytes later we have the 3rd element (index 2).
So what happens when you do something like my_arr[1]: you’re telling the computer to go to the memory address of the start of the array (in this case 0x023), and then skip one int element, so jump 4 bytes.
Doing that, the computer would go to address 0x027 to find the value of my_arr[1], which is 1033 in this example.
So from a functionality point of view, an array behaves a lot like a pointer!
It refers to an address in memory that stores a bunch of values of the same type, and the computer uses that address to find the element at index [i].
Pointers
Usually, this mapping from memory to values and memory to variables is done behind the scenes by the compiler and computer. However, sometimes you might want to actually manage things. That’s when pointers come in handy!
A pointer is simply a variable that stores an address.
So if you have an int pointer p, it means that you, the compiler, and the computer expect the value inside of p to be an address in memory that stores an integer.
Let’s take a look at a first example:
#include <iostream>
using std::cout;
int main() {
    int my_var;
    int *my_ptr;
    my_var = 2;
    my_ptr = &my_var;
    cout << "my_var: " << my_var << '\n';
    cout << "my_ptr: " << my_ptr << '\n';
    return 0;
}
First thing to note here is that we added an asterisk between int and my_ptr.
To create a pointer, you do the same thing as for a regular variable, but add an *, which indicates this is not a regular variable of this type, but a pointer of this type (e.g., int *).
Then, the next thing to note is the operator we used: &.
Between two values, & performs the bit-wise AND operation (logical AND is &&), as we’ve seen previously.
But when you add an & before a variable, this returns you the address of that variable in memory.
So what we’re doing in this code is creating an int variable my_var, and an int pointer my_ptr.
Then, we store 2 inside my_var and we store the address of my_var inside my_ptr.
If you run this code, you should get something like this:
my_var: 2
my_ptr: 0x16dc8f4c8
my_var should always be 2, but if you get that exact same memory address stored inside my_ptr… maybe you should go play the lottery :)
Just to keep going with our previous visualizations, let’s assume my_var is at address 0x023.
If that was the case, we could have a visualization like this:
The visualization above shows the same my_var that we had seen before, which sits at address 0x023 and stores a value of 2 in there.
Now, we have my_ptr right after it, using the address 0x027 and storing 0x023, the address of my_var!
Also interesting to note here is that our pointer took 8 bytes! Since pointers store memory addresses, the size of a pointer depends only on the architecture that your code runs on. If your computer uses a 64-bit architecture, then your pointers should all take 8 bytes, regardless of whether it’s a pointer to a char, a short, an int, a long double, etc.
Pointer dereference
So now we know how to find the address of a variable, and how to store it somewhere. But that’s not too interesting on its own: what can we even do with these addresses?
Once you have a pointer to an address, you can dereference it, which means that you’re navigating to that address and checking what’s stored.
That’s done with the * operator.
Let’s see an example:
#include <iostream>
using std::cout;
int main() {
    int my_var;
    int *my_ptr;
    my_var = 2;
    my_ptr = &my_var;
    cout << "my_var: " << my_var << '\n';
    cout << "my_ptr: " << my_ptr << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    return 0;
}
This should give you something like:
my_var: 2
my_ptr: 0x16dc674c8
*my_ptr: 2
On my computer the address for my_var has changed, but that’s fine since we just get the current address in the code with &my_var.
But now we can see that *my_ptr gives us the value 2, which is the same as in my_var.
Here’s a visualization of what’s going on:
When we use *my_ptr, first we check what address we have stored inside my_ptr.
Then, we actually go to that address, and then read the value that’s in there.
Keep in mind that the value is read based on the pointer type.
For example, if you have a char pointer that actually points to an unsigned char variable, you would read a 255 as -1 (assuming char is signed on your platform), since both have the same representation in bits.
We now have two ways to reach the same value:
- my_var, directly, since it’s at that address; and
- my_ptr, indirectly, by storing my_var’s address and dereferencing it.
We can change the value using either access point and both would be updated:
#include <iostream>
using std::cout;
int main() {
    int my_var;
    int *my_ptr;
    my_var = 2;
    my_ptr = &my_var;
    cout << "my_var: " << my_var << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    my_var = 123;
    cout << "\nchanged my_var:\n\n";
    cout << "my_var: " << my_var << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    *my_ptr = -10923;
    cout << "\nchanged *my_ptr:\n\n";
    cout << "my_var: " << my_var << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    return 0;
}
This should output:
my_var: 2
*my_ptr: 2

changed my_var:

my_var: 123
*my_ptr: 123

changed *my_ptr:

my_var: -10923
*my_ptr: -10923
By changing the value of my_var, whenever we dereference my_ptr we get the most up-to-date value.
And by changing the value inside the address that’s stored in my_ptr, we change the value of my_var since its address is what we have stored.
Changing addresses
Now, what would happen if you change the value of my_ptr itself, instead of *my_ptr?
You would just make it point to a different memory address, and you may or may not know what’s stored there.
For example, if you have an array, you can use my_ptr++ to advance to the next position.
Let’s check with an example:
#include <iostream>
using std::cout;
int main() {
    int my_arr[3]{1, 10, 100};
    int *my_ptr = my_arr;
    cout << "my_ptr: " << my_ptr << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    my_ptr++;
    cout << "my_ptr: " << my_ptr << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    my_ptr++;
    cout << "my_ptr: " << my_ptr << '\n';
    cout << "*my_ptr: " << *my_ptr << '\n';
    return 0;
}
If you run this code, you should get something like this:
my_ptr: 0x16f1cf4b8
*my_ptr: 1
my_ptr: 0x16f1cf4bc
*my_ptr: 10
my_ptr: 0x16f1cf4c0
*my_ptr: 100
Again, the addresses might change and that’s fine!
But just notice how we started at some address XXb8, and then advanced 4 bytes to XXbc when we used ++.
That jump of 4 bytes was because our pointer is an int *, so it’s jumping to the next integer in memory.
If there were nothing there or some other value, we might get a weird reading from this. So it’s important to be very careful when manipulating pointers, otherwise you might end up in some weird/invalid memory position.
Member access
You can have pointers to any type, including user-defined ones.
If you have a pointer to a type with members (e.g., a struct or a string) that you would normally access with the . operator, you should use the arrow operator -> instead.
For example:
#include <iostream>
#include <string>
using std::cout;
using std::string;
int main() {
    string my_str = "Hello!!";
    string *my_ptr = &my_str;
    cout << "my_str.length(): " << my_str.length() << '\n';
    cout << "my_ptr->length(): " << my_ptr->length() << '\n';
    return 0;
}
If you want to access the method/variable with the . operator, you’d need to dereference it first:
cout << "(*my_ptr).length(): " << (*my_ptr).length() << '\n';
This code, with this last line above added at the end, should output:
my_str.length(): 7
my_ptr->length(): 7
(*my_ptr).length(): 7
Pass-by-pointer
Since a pointer is just a variable, you can also receive them in functions. This would allow you to change the values outside the function, since you have direct access to their addresses.
For example:
void double_value(int *my_param) {
    *my_param *= 2;
}
This function receives a pointer as an argument and doubles the value inside the address that it points to. To call this function, we’d need to pass an address to it:
int my_var = 2;
double_value(&my_var);
cout << my_var; // should output 4
Receiving arrays as pointers
As we mentioned, an array behaves like a pointer to its first element: when you pass one to a function, it is converted to exactly that. So you can receive an array as a pointer in a function:
#include <iostream>
void print_first(int *arr) {
    std::cout << arr[0];
}
int main() {
    int my_array[3]{4, 5, 6};
    print_first(my_array);
    return 0;
}
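Note how we don’t need to use & here, since my_array by itself already converts to the address of its first element.
However, if you have a multi-dimensional array that you created statically, you won’t be able to receive it as a plain int * (or int **). It converts to a pointer to its first row instead, e.g., int (*)[3] for rows of 3 ints.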
Common Errors
Many people say they don’t like pointers, and that’s just because you could get some hard-to-debug errors since you might mess up the memory. So here are some common ones I’ve faced:
Pointing to nowhere
The main difference between receiving a pointer in a function and receiving a variable by reference is that your pointer could be NULL.
If you have a null pointer and try to dereference it, you’d get undefined behavior.
It’s possible you’ll get an error, but your code could just misbehave.
An easy way to avoid this is adding checks for NULL or nullptr (nullptr is a keyword, so it takes no std:: prefix).
Using our previous function as an example:
void double_value(int *my_param) {
    if (my_param != NULL) {
        *my_param *= 2;
    }
}
Now, our function would work fine even if the user called it like this:
double_value(NULL);
double_value(nullptr);
Pointing to uninitialized memory
If you forget to initialize your pointer, it might hold a trash value, which means that you’ll be pointing to some random address in memory and might get unexpected behavior.
So just a reminder that, if you have a pointer, initialize it with some variable address like we did in previous examples, or with nullptr, assuming you have checks like we just saw above.
Pointing to invalid memory
Lastly, similar to what we saw with arrays, we can point anywhere! But we might end up outside our memory space. If you point to some memory that’s not allowed in your code, you’ll probably get an error.
For example, running this:
#include <iostream>
using std::cout;
int main() {
    int my_var = 2;
    int *my_ptr = &my_var;
    my_ptr += 1000000;  // far outside my_var: undefined behavior from here on
    cout << *my_ptr;
    return 0;
}
gave me a segmentation fault:
[1] 83093 segmentation fault ./a.out
So be careful when changing the address manually! Make sure you’re moving only where you want, and if you get any bugs, check your pointers first to make sure the error is not there.
References
- Comic reference: https://xkcd.com/371/
- [1] Memory management might vary based on platform, OS, architecture, etc. However, we usually assume that memory increases in addresses (e.g., my_array[1] is stored after my_array[0]), and C++ makes the same assumption. If anything is different, the compiler should take care of that for you.