C programming strxfrm string.h


In C programming, the ‘strxfrm’ function declared in <string.h> transforms a string in such a way that comparing two transformed strings with ‘strcmp’ gives the same result as comparing the two original strings with ‘strcoll’. I am sure you are wondering what the heck that means. A verbal description doesn’t do much good in explaining the concept of ‘strxfrm’, so let’s go to a code example, but before that let’s see the declaration of the function.

size_t strxfrm(char * restrict s1,
 const char * restrict s2,
 size_t n);

Parameters:
s1 - Pointer to the array that will store the transformed string.

s2 - The string which is to be transformed.

n - The maximum number of characters, including the terminating null character, that may be placed in the ‘s1’ array.

Return type
size_t - Returns the length of the transformed string (not including the terminating null character). If the value returned is ‘n’ or more, the contents of the array pointed to by ‘s1’ are indeterminate.
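The rule about the return value can be turned into a small check. The sketch below is only an illustration: the helper name ‘xfrm_fits’ is made up for this article and is not part of <string.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): transform src into dst,
   which has room for n characters. Returns 1 if the whole transformed
   string, including its terminating null character, fit into dst.
   Returns 0 if it did not fit; dst is then indeterminate and must not
   be used. */
int xfrm_fits(char *dst, size_t n, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strxfrm(dst, src, n);

    /* len excludes the terminating null, so it must be strictly below n */
    return len < n;
}
```

A caller would test the result before passing ‘dst’ to ‘strcmp’, for instance: if (!xfrm_fits(buf, sizeof buf, word)) { /* use a bigger buffer */ }.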

Note that when the string is transformed, a terminating null character is automatically appended to it. If ‘n’ is 0, ‘s1’ is allowed to be a null pointer, so the expression given below is the size of the array needed to hold the transformed version of the string ‘s’.

1+ strxfrm(NULL, s, 0)

The plus 1 is for the terminating null character.
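The sizing expression above is usually used in two steps: one call to measure, one call to fill. Here is a minimal sketch of that idea, assuming the locale is not changed between the two calls. The helper name ‘xfrm_dup’ is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a library function): returns a newly
   allocated, transformed copy of s, or NULL if memory ran out.
   The caller must free() the result. */
char *xfrm_dup(const char *s)
{
    assert(s != NULL);

    /* First call: n is 0, so nothing is written and NULL is allowed. */
    size_t need = 1 + strxfrm(NULL, s, 0);

    char *key = malloc(need);
    if (key == NULL)
        return NULL;

    /* Second call: the buffer is exactly large enough. */
    strxfrm(key, s, need);
    return key;
}
```

This avoids guessing a fixed buffer size, at the cost of calling ‘strxfrm’ twice per string.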

As said earlier, when two transformed strings are compared using ‘strcmp’, the result must be the same as the result obtained by comparing the two original strings with ‘strcoll’. The ‘strcoll’ function compares strings according to the collation rules of the current locale (the LC_COLLATE category). For instance, if we compare two Russian strings with ‘strcoll’, the result may be different from the result obtained by comparing the same two strings with ‘strcmp’. More about the ‘strcmp’ and ‘strcoll’ functions can be found in the links given below. Consider the code example.

Link: C strcmp string.h
Link: C strcoll string.h

Code example

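#include <locale.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *rs[] = { "Man", "Cat" };
    int ret;

    /* Select the collation rules used at run time. In our case the locale
       is set to Russian; the value 20880, also known as a code page, is
       what does this. Locale names are implementation-defined, and
       setlocale returns NULL if the name is not supported. */
    char *check = setlocale(LC_COLLATE, ".20880");
    if (check == NULL)
        fprintf(stderr, "locale not available\n");

    ret = strcoll(rs[0], rs[1]);
    printf("\nret=%d", ret);

    ret = strcmp(rs[0], rs[1]);
    printf("\nret=%d", ret);

    return 0;
}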

Output:

ret=-1
ret=1

If you look at the output, the value returned by ‘strcoll’ is ‘-1’ but that of ‘strcmp’ is ‘1’. This is because in the Russian alphabet the character ‘M’ comes before ‘C’, so ‘strcoll’, which compares the two strings using the Russian locale, returns ‘-1’. ‘strcmp’, on the other hand, is unaffected by the locale that was set: it compares the character codes, as though it were comparing letters of the English alphabet, and since ‘M’ comes after ‘C’ there, ‘1’ is returned. Note that only the sign of the returned value is specified, so another system may print a different negative or positive number.
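One thing the example does not do is look at the value stored in ‘check’. Locale names such as “.20880” are not portable, and ‘setlocale’ returns a null pointer when it cannot honour the request, in which case the old collation rules stay in effect. A small sketch of that check is given below. The helper name ‘set_collation’ is made up for illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not a library function): try to select the
   collation rules named by `name`. Returns 1 on success and 0 if the
   implementation rejected the name; in that case the previous
   LC_COLLATE setting is left unchanged. */
int set_collation(const char *name)
{
    assert(name != NULL);

    if (setlocale(LC_COLLATE, name) == NULL) {
        fprintf(stderr, "locale \"%s\" is not available\n", name);
        return 0;
    }
    return 1;
}
```

The name “C” is the one locale every implementation must accept, so set_collation("C") can be used to go back to plain ‘strcmp’-like ordering.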

We will now apply the ‘strxfrm’ function to the two strings of the above example, so that comparing the two transformed strings with ‘strcmp’ yields the same result as comparing the original strings with the ‘strcoll’ function.

Code example

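#include <locale.h>
#include <stdio.h>
#include <string.h>

#define SIZE 1024   /* capacity of each buffer for a transformed string */

int main(void)
{
    const char *rs[] = { "Man", "Cat" };
    char trsfrm1[SIZE], trsfrm2[SIZE];
    int ret;

    /* set the locale to Russian; setlocale returns NULL on failure */
    char *check = setlocale(LC_COLLATE, ".20880");
    if (check == NULL)
        fprintf(stderr, "locale not available\n");

    strxfrm(trsfrm1, rs[0], SIZE);   /* transform rs[0] */
    strxfrm(trsfrm2, rs[1], SIZE);   /* transform rs[1] */

    ret = strcoll(rs[0], rs[1]);
    printf("\nret=%d", ret);

    ret = strcmp(trsfrm1, trsfrm2);
    printf("\nret=%d", ret);

    return 0;
}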

Output:

ret=-1
ret=-1

Look at the output: both ‘strcmp’ and ‘strcoll’ give the same value, i.e. -1. Note that this transformation of strings works for the collation rules of almost any locale, e.g. Spanish, French, Bulgarian, etc.
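The relationship shown by the two outputs can also be checked in code. The sketch below compares only the signs of the two results, since that is all the functions promise, and it works with whatever locale is currently selected. The names ‘sign’ and ‘xfrm_agrees_with_strcoll’ are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduce any int to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper (not a library function): returns 1 if strcmp on
   the transformed strings orders a and b the same way as strcoll on
   the originals, 0 if not, and -1 if memory could not be allocated. */
int xfrm_agrees_with_strcoll(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t na = 1 + strxfrm(NULL, a, 0);
    size_t nb = 1 + strxfrm(NULL, b, 0);
    char *ta = malloc(na);
    char *tb = malloc(nb);
    int result = -1;

    if (ta != NULL && tb != NULL) {
        strxfrm(ta, a, na);
        strxfrm(tb, b, nb);
        result = sign(strcoll(a, b)) == sign(strcmp(ta, tb));
    }

    free(ta);   /* free(NULL) is harmless */
    free(tb);
    return result;
}
```

On a conforming implementation the function should return 1 for any pair of strings, in any locale.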

If you still can’t understand the concept of the ‘strxfrm’ function, comment below (don’t feel shy).


*Side Note

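Using ‘strxfrm’ and ‘strcmp’ may be more efficient than ‘strcoll’ if you need to use the same string in many comparisons, for example when sorting, because each string is transformed only once and every later comparison is a plain ‘strcmp’.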
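Sorting is the typical case. The sketch below transforms every word once, sorts on the transformed keys with ‘qsort’ and ‘strcmp’, and then writes the original words back in sorted order. The names ‘struct keyed’, ‘by_key’ and ‘sort_by_collation’ are made up for this illustration, and whether this is actually faster than calling ‘strcoll’ directly depends on the implementation and on the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* An original word together with its transformed sort key. */
struct keyed {
    const char *text;
    char *key;
};

/* qsort callback: compare the precomputed keys byte by byte. */
static int by_key(const void *a, const void *b)
{
    const struct keyed *x = a;
    const struct keyed *y = b;
    return strcmp(x->key, y->key);
}

/* Hypothetical helper (not a library function): sort words[0..n-1]
   according to the current LC_COLLATE rules. Returns 0 on success,
   or -1 on allocation failure (words is then left unchanged). */
int sort_by_collation(const char **words, size_t n)
{
    assert(words != NULL);

    struct keyed *items = malloc((n ? n : 1) * sizeof *items);
    if (items == NULL)
        return -1;

    size_t made = 0;   /* number of keys allocated so far */
    int status = 0;

    for (size_t i = 0; i < n; i++) {
        size_t need = 1 + strxfrm(NULL, words[i], 0);
        items[i].text = words[i];
        items[i].key = malloc(need);
        if (items[i].key == NULL) {
            status = -1;
            break;
        }
        strxfrm(items[i].key, words[i], need);   /* transform once */
        made++;
    }

    if (status == 0) {
        qsort(items, n, sizeof *items, by_key);
        for (size_t i = 0; i < n; i++)
            words[i] = items[i].text;
    }

    for (size_t i = 0; i < made; i++)
        free(items[i].key);
    free(items);
    return status;
}
```

With n words the sort performs on the order of n log n comparisons but only n transformations, which is where the possible saving comes from.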


Related links

->C strncmp string.h

->C memcmp string.h