Asked  6 Months ago    Answers:  5   Viewed   49 times

I am trying to understand pointers in C but I am currently confused with the following:

  • char *p = "hello"

    This is a char pointer pointing at the character array, starting at h.

  • char p[] = "hello"

    This is an array that stores hello.

What is the difference when I pass both these variables into this function?

void printSomething(char *p)
{
    printf("p: %s",p);
}

 Answers

83

char* and char[] are different types, but it's not immediately apparent in all cases. This is because arrays decay into pointers, meaning that if an expression of type char[] is provided where one of type char* is expected, the compiler automatically converts the array into a pointer to its first element.

Your example function printSomething expects a pointer, so if you try to pass an array to it like this:

char s[10] = "hello";
printSomething(s);

The compiler pretends that you wrote this:

char s[10] = "hello";
printSomething(&s[0]);
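
A minimal sketch of that conversion, assuming two hypothetical helpers (received and same_address) that are not part of the question. It hands the array to a function that expects a pointer, once by name and once as &s[0], and checks that the callee gets the same address both times:

```c
/* Hypothetical helper (not from the question): returns the pointer it was given. */
static const char *received(const char *p)
{
    return p;
}

/* Returns 1 when passing s and passing &s[0] give the callee the same address. */
int same_address(void)
{
    char s[10] = "hello";

    /* In the first call the array name decays to a pointer to s[0]. */
    return received(s) == received(&s[0]);
}
```

Calling same_address() should return 1, since both calls pass a pointer to the first element.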
Tuesday, June 1, 2021
 
ojrac
answered 6 Months ago
56

The difference here is that

char *s = "Hello world";

typically places "Hello world" in a read-only part of memory and makes s a pointer to it, so any write through s is illegal (in C it is undefined behavior).

While doing:

char s[] = "Hello world";

still puts the literal string in read-only memory, but copies it into a newly allocated array (on the stack, when s is a local variable). That makes

s[0] = 'J';

legal.
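
A small sketch of the legal case, assuming a hypothetical helper named jello (not from the answer). The pointer variant is left in a comment only, because running it would be undefined behavior:

```c
#include <string.h>

/*
 * Hypothetical helper: builds a writable array from the literal, changes
 * its first character, and copies the result into out.
 * Returns 0 on success, -1 if out is too small.
 */
int jello(char *out, size_t out_size)
{
    char s[] = "Hello world";   /* array: a writable copy of the literal */
    /* char *t = "Hello world"; t[0] = 'J';  would be undefined behavior */

    s[0] = 'J';                 /* legal: s is ordinary writable storage */
    if (out_size < sizeof s) {
        return -1;
    }
    memcpy(out, s, sizeof s);   /* sizeof s counts the terminating '\0' */
    return 0;
}
```

With a large enough buffer, out ends up holding "Jello world".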

Tuesday, June 1, 2021
 
DilbertDave
answered 6 Months ago
56

According to the API docs, isLetter() returns true if the character has any of the following general category types: UPPERCASE_LETTER (Lu), LOWERCASE_LETTER (Ll), TITLECASE_LETTER (Lt), MODIFIER_LETTER (Lm), OTHER_LETTER (Lo). If we compare isAlphabetic(), it has the same but adds LETTER_NUMBER (Nl), and also any characters having Other_Alphabetic property.

What does this mean in practice? Every letter is alphabetic, but not every alphabetic is a letter - in Java 7 (which uses Unicode 6.0.0), there are 824 characters in the BMP which are alphabetic but not letters. Some examples include 0345 (a combiner used in polytonic Greek), Hebrew vowel points (niqqud) starting at 05B0, Arabic honorifics such as saw ("peace be upon him") at 0610, Arabic vowel points... the list goes on.

But basically, for English text, the distinction makes no difference. For some other languages, the distinction might make a difference, but it is hard to predict in advance what the difference might be in practice. If one has a choice, the best answer may be isLetter() - one can always change to permit additional characters in the future, but reducing the set of accepted characters might be harder.

Wednesday, July 28, 2021
 
DMTintner
answered 4 Months ago
88

Let me start off by saying something a little off topic:

  • I don't think this is a very good book. I think it confuses some topics to make them seem harder than they really are. For a better advanced C book, I would recommend Deep C Secrets by Peter van der Linden, and for a beginner's book, I'd recommend the original K & R

Anyway, it looks like you're looking at the extra credit exercises from this chapter.

  • Another aside: I don't think this is an especially sensible exercise for learning (another answer pointed out that the question isn't formed to make sense), so this discussion is going to get a little complex. I would instead recommend the exercises from Chapter 5 of K & R.

First we need to understand that pointers are not the same as arrays. I've expanded on this in another answer here, and I'm going to borrow the same diagram from the C FAQ. Here's what's happening in memory when we declare an array or a pointer:

 char a[] = "hello";  // array

   +---+---+---+---+---+---+
a: | h | e | l | l | o |\0 |
   +---+---+---+---+---+---+

 char *p = "world"; // pointer

   +-----+     +---+---+---+---+---+---+
p: |  *======> | w | o | r | l | d |\0 |
   +-----+     +---+---+---+---+---+---+

So, in the code from the book, when we say:

int ages[] = {23, 43, 12, 89, 2};

We get:

      +----+----+----+----+---+
ages: | 23 | 43 | 12 | 89 | 2 |
      +----+----+----+----+---+

I'm going to use an illegal statement for the purpose of explanation - if we could have said:

int *ages = {23, 43, 12, 89, 2}; // The C grammar prohibits initialised array
                                 // declarations being assigned to pointers, 
                                 // but I'll get to that

It would have resulted in:

      +---+     +----+----+----+----+---+
ages: | *=====> | 23 | 43 | 12 | 89 | 2 |
      +---+     +----+----+----+----+---+

Both of these can be accessed the same way later on - the first element "23" can be accessed by ages[0], regardless of whether it's an array or a pointer. So far so good.

However, when we want to get the count we run into problems. C doesn't keep track of how long an array is at run time; sizeof only reports the size (in bytes) of an object whose type the compiler can see. This means that, with the array, you can work out the size by saying:

int count = sizeof(ages) / sizeof(int);

or, more safely:

int count = sizeof(ages) / sizeof(ages[0]);

In the array case, this says:

int count = the number of bytes in (an array of 5 integers) / 
                 the number of bytes in (an integer)

which correctly gives the length of the array. However, for the pointer case, it will read:

int count = the number of bytes in (**a pointer**) /
                 the number of bytes in (an integer)

which is almost certainly not the same as the length of the array. Where pointers to arrays are used, we need to use another method to work out how long the array is. In C, it is normal to either:

  • Remember how many elements there were:

    int *ages = {23, 43, 12, 89, 2}; // Remember you can't actually
                                     // assign like this, see below
    int ages_length = 5;
    for (i = 0; i < ages_length; i++) { /* use ages[i] */ }
    
  • or, keep a sentinel value (that will never occur as an actual value in the array) to indicate the end of the array:

    int *ages = {23, 43, 12, 89, 2, -1}; // Remember you can't actually
                                         // assign like this, see below
    for (i = 0; ages[i] != -1; i++) { /* use ages[i] */ }
    

    (this is how strings work, using the special NUL value '\0' to indicate the end of a string)
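
Both approaches can be sketched as follows. The function names (sum_with_length, sum_until_sentinel, count_ages) are hypothetical and chosen only for illustration; the sentinel version assumes -1 never occurs as a real age:

```c
#include <stddef.h>

/* Approach 1: the caller remembers the length and passes it along. */
int sum_with_length(const int *values, size_t length)
{
    int total = 0;
    for (size_t i = 0; i < length; i++) {
        total += values[i];
    }
    return total;
}

/* Approach 2: a sentinel (-1 here) marks the end of the data. */
int sum_until_sentinel(const int *values)
{
    int total = 0;
    for (size_t i = 0; values[i] != -1; i++) {
        total += values[i];
    }
    return total;
}

/* sizeof gives the element count only while ages is still an array. */
size_t count_ages(void)
{
    int ages[] = {23, 43, 12, 89, 2};
    return sizeof(ages) / sizeof(ages[0]);
}
```

Inside sum_with_length and sum_until_sentinel, values is a plain pointer, so sizeof(values) there would give the size of a pointer and not the size of the array.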


Now, remember that I said you can't actually write:

    int *ages = {23, 43, 12, 89, 2, -1}; // Illegal

This is because the compiler won't let you assign an implicit array to a pointer. If you REALLY want to, you can write:

    int *ages = (int *) (int []) {23, 43, 12, 89, 2, -1}; // Horrible style 

But don't, because it is extremely unpleasant to read. For the purposes of this exercise, I would probably write:

    int ages_array[] = {23, 43, 12, 89, 2, -1};
    int *ages_pointer = ages_array;

Note that the compiler is "decaying" the array name to a pointer to its first element there - it's as if you had written:

    int ages_array[] = {23, 43, 12, 89, 2, -1};
    int *ages_pointer = &(ages_array[0]);

However - you can also dynamically allocate the arrays. For this example code, it will become quite wordy, but we can do it as a learning exercise. Instead of writing:

int ages[] = {23, 43, 12, 89, 2};

We could allocate the memory using malloc:

int *ages = malloc(sizeof(int) * 5); // create enough space for 5 integers
if (ages == NULL) { 
   /* we're out of memory, print an error and exit */ 
}
ages[0] = 23;
ages[1] = 43;
ages[2] = 12;
ages[3] = 89;
ages[4] = 2;

Note that we then need to free ages when we're done with the memory:

free(ages); 

Note also that there are a few ways to write the malloc call:

 int *ages = malloc(sizeof(int) * 5);

This is clearer to read for a beginner, but generally considered bad style because there are two places you need to change if you change the type of ages. Instead, you can write either of:

 int *ages = malloc(sizeof(ages[0]) * 5);
 int *ages = malloc(sizeof(*ages) * 5);

These statements are equivalent - which you choose is a matter of personal style. I prefer the first one.
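
Putting the pieces together, here is a hedged sketch that assumes a hypothetical helper make_ages (not in the book's code). It uses the sizeof(*ages) form, checks for failure, and leaves the free to the caller:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocates an int array holding the ages from the
 * example and stores the element count in *count_out.
 * Returns NULL if allocation fails; the caller must free() the result.
 */
int *make_ages(size_t *count_out)
{
    static const int initial[] = {23, 43, 12, 89, 2};
    size_t count = sizeof(initial) / sizeof(initial[0]);

    int *ages = malloc(sizeof(*ages) * count);  /* size follows the type of ages */
    if (ages == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        ages[i] = initial[i];
    }
    *count_out = count;
    return ages;
}
```

A caller would check the result against NULL, use ages[0] through ages[count - 1], and then call free(ages).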


One final thing - if we're changing the code over to use arrays, you might look at changing this:

int main(int argc, char *argv[]) {

But, you don't need to. The reason why is a little subtle. First, this declaration:

char *argv[]

says "there is an array of pointers-to-char called argv". However, the compiler treats arrays in function arguments as a pointer to the first element of the array, so if you write:

int main(int argc, char *argv[]) {

The compiler will actually see:

int main(int argc, char **argv)

This is also the reason that you can omit the length of the first dimension of a multidimensional array used as a function argument - the compiler won't see it.
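
A short sketch of both points, with hypothetical function names (count_strings, sum_grid). The first function is declared with the pointer form and defined with the array form, which compiles because the two parameter types are the same. The second omits the first dimension of a two-dimensional parameter:

```c
#include <stddef.h>

/* Declared with the pointer form ... */
size_t count_strings(char **argv);

/* ... and defined with the array form: the compiler treats both as char **. */
size_t count_strings(char *argv[])
{
    size_t n = 0;
    while (argv[n] != NULL) {   /* assumes a NULL-terminated list, like argv */
        n++;
    }
    return n;
}

/* The first dimension is omitted; the second (3) is still required. */
int sum_grid(int grid[][3], size_t rows)
{
    int total = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < 3; c++) {
            total += grid[r][c];
        }
    }
    return total;
}
```

Here grid is really a pointer to an array of 3 ints, which is why only the inner dimension has to be spelled out.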

Thursday, October 21, 2021
 
alez
answered 1 Month ago
67

Ok, first, to understand this it's important to know that const in C doesn't have anything to do with read-only memory. The C language itself has no notion of sections. const is merely a contract: it expresses the intention that something is indeed constant. This means a compiler/linker can place the data in a read-only section, because the programmer has promised it won't change. It doesn't have to, though.

Second, a string literal translates to a constant array of chars with 0 implicitly appended. See Peter Schneider's comment here: it is not formally const (so the compiler won't warn you when you take a non-const pointer to it), but it should be.

Combining this, the following code segfaults on my system with gcc on Linux amd64, because gcc indeed places the array in a read-only section:

#include <stdio.h>

const int myInts[] = {3, 6, 1, 2, 3, 8, 4, 1, 7, 2};

int main(void)
{
    printf("First element of array: %i\n", myInts[0]);
    int *myIntsPtr = myInts;
    *myIntsPtr = *(myIntsPtr + 1);
    printf("First element of array: %i\n", myInts[0]);
    return 0;
}

Note there is also a compiler warning in the line where you take a non-const pointer to the const array.

By the way, with gcc the same code will work if you declare the array inside your function, because then the array itself is created on the stack. You still get the warning, and the code is still wrong; that it happens to run is a technical detail of how C is implemented here. The difference from a string literal is that a string literal is an anonymous object (the char array doesn't have an identifier) and has static storage duration in any case.


Edit, to explain what a string literal does: the following two programs are equivalent:

int main(void)
{
    const char *foo = "bar";
}

and

const char ihavenoname_1[] = {'b', 'a', 'r', 0};

int main(void)
{
    const char *foo = ihavenoname_1;
}

So, short story, if you want gcc to put data in a read-only section, declare it const with static storage duration (outside of a function). Other compilers might behave differently.
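
If the data does need to change, one safe option (a sketch only, with a hypothetical helper copy_and_edit) is to leave the const array alone and modify a writable copy:

```c
#include <string.h>

/* const with static storage duration: gcc may place this in a read-only section. */
static const int myInts[] = {3, 6, 1, 2, 3, 8, 4, 1, 7, 2};

/*
 * Hypothetical helper: copies myInts into dest and edits the copy.
 * Returns the number of elements copied, or 0 if dest is too small.
 */
size_t copy_and_edit(int *dest, size_t dest_len)
{
    size_t count = sizeof(myInts) / sizeof(myInts[0]);
    if (dest_len < count) {
        return 0;
    }
    memcpy(dest, myInts, sizeof(myInts));
    dest[0] = dest[1];      /* fine: dest is writable, myInts is untouched */
    return count;
}
```

This avoids the non-const pointer to a const object entirely, so there is no warning and no dependence on where the compiler places myInts.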

Saturday, November 27, 2021
 
ErocM
answered 2 Days ago