Zhou Ligong teaches you programming: structures, memory alignment and basic data types

Chapter 2 is about programming technology. This article covers sections 2.2.1 (memory alignment) and 2.2.2 (basic data types).

We know that an array is an ordered collection of data of the same type, but in many cases different types of data need to be bundled together as a whole to make programming easier. In C, such a bundle of data is called a structure.

> > > 2.2.1 Memory alignment

Although every variable is ultimately stored in memory at a specific address, that address must satisfy the memory alignment requirements, mainly for two reasons:

Platform reasons: Not all hardware platforms (especially the low-end microprocessors used in embedded systems) can access any data at any address. Some hardware platforms can only access aligned addresses; an unaligned access raises a hardware exception.

Performance reasons: If data is stored at an unaligned address, the processor may need two memory accesses to read or write the variable, whereas an aligned access needs only one.

In a 32-bit microprocessor, the processor accesses memory 32 bits at a time, that is, each read or write transfers 4 bytes. For example, the microprocessor treats the 16 bytes of memory at addresses 0x0 ~ 0xF not as 16 single bytes but as 4 blocks of 4 bytes each, as shown in Figure 2.4.

Figure 2.4 Schematic diagram of memory space

Obviously, 4 bytes can be fetched in one access only from addresses that are integer multiples of 4, such as 0x0, 0x4, 0x8 and 0xC; 4 bytes cannot be read in one access from an arbitrary address. Assume that a 4-byte int is stored in the 4 bytes of memory starting at address 0, as shown in Figure 2.5.

Figure 2.5 Storing int data at an aligned address

Since the int type data is stored in block 0, the CPU only needs one memory access to complete the reading or writing of the data. Conversely, if the int type data is stored in the 4-byte memory space starting at address 1, the schematic diagram is shown in Figure 2.6.

Figure 2.6 Storing int data at an unaligned address

In this case the data straddles block 0 and block 1, so two memory accesses are needed to read it. The first access, to block 0, obtains three bytes of the data; the second access, to block 1, obtains the remaining byte; the bytes are then combined into a complete int. Storing data at an unaligned address therefore greatly reduces CPU efficiency. Some microprocessors will not do this extra work at all: an unaligned access raises a system exception and the program crashes. The specific rules for memory alignment are as follows:

(1) The first address of the memory space of each member variable of the structure must be an integer multiple of the smaller of the "alignment factor" and the "actual length of the variable". For example, if a variable must be aligned on 4 bytes, the first address of its memory space must be an integer multiple of 4, so the addresses that satisfy the condition are 0x0, 0x4, 0x8, 0xC...

(2) After each data member has been aligned, the structure itself must also be aligned: the total size of the structure must be an integer multiple of the smaller of the "alignment factor" and the "length of the longest data member".

In general, the alignment factor equals the word length of the microprocessor. For example, the alignment factor of a 32-bit microprocessor is 4 bytes. The actual length of a variable depends on its type. The length of a type can be calculated as follows:
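A minimal sketch of such a program, using the sizeof operator. The choice of types (char, int, long, float, double) and the function name printTypeSizes are assumptions made for illustration; the sizes printed depend on the compiler and target, and on many 64-bit platforms long is 8 bytes rather than 4.

```c
#include <stdio.h>

/* Print the length, in bytes, of a few basic types.
   The list of types is an assumption, not taken from the article. */
void printTypeSizes(void)
{
    printf("%zu, %zu, %zu, %zu, %zu\n",
           sizeof(char), sizeof(int), sizeof(long),
           sizeof(float), sizeof(double));
}
```

sizeof is evaluated at compile time for these types, so calling the function costs nothing beyond the printf itself.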

The output of this program is: 1, 4, 4, 4, 8. Assuming the CPU is a 32-bit microprocessor with an alignment factor of 4, the structure variable data is defined as follows:
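A sketch of what such a definition might look like. The member list is an assumption reconstructed from the storage locations and lengths described in the text (a char at [0], a 2-byte member b, a two-element array c, an 8-byte double d, 19 bytes of members in total); the tag Data, the names e, f and g, and the type of f are hypothetical. #pragma pack is a compiler extension (GCC, Clang, MSVC) used here only to emulate a 4-byte alignment factor on hosts whose default differs.

```c
/* Emulate an alignment factor of 4 bytes (compiler extension, not standard C). */
#pragma pack(push, 4)
struct Data {
    char   a;      /* length 1 */
    short  b;      /* length 2 on typical targets */
    char   c[2];   /* two members of length 1: c[0] and c[1] */
    double d;      /* length 8 */
    char   e;      /* length 1 (hypothetical member) */
    int    f;      /* length 4 on typical targets (hypothetical member) */
    char   g;      /* length 1 (hypothetical member) */
};
#pragma pack(pop)

struct Data data;  /* the structure variable */
```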

Member addresses are calculated starting from the first address of the structure (the compiler guarantees that this address meets the memory alignment requirement; assume it is 0), and the members are stored in the order in which they are defined. See Table 2.1 for details.

Table 2.1 Storing each member in turn

The actual storage location is written as [x, y], where x is the starting address and y is the ending address. If x equals y, it is written simply as [x]. Take member b as an example: its length is 2, which is smaller than the alignment factor, so it is aligned on 2 bytes and its address must be a multiple of 2. Address 0 is already occupied by member a, so b is stored in the nearest space that meets the requirement, [2, 3]. Space [1] has to be discarded because it does not meet the requirement for storing b. Note that the array member c must not be treated as a single member of length 2 when it is stored; it should be regarded as two members, c[0] and c[1]. The members therefore occupy [0, 24], and the memory at addresses 1, 6, 7, 17, 18 and 19 is discarded.

When all the members have been stored, the structure itself must be aligned: the size of the structure must be an integer multiple of the number of alignment bytes, which is the smaller of the length of the longest member and the alignment factor. Here the longest member is d, of type double, whose length of 8 is larger than the alignment factor, so the structure itself is aligned on 4 bytes and its size must be an integer multiple of 4. The members occupy [0, 24], which is 25 bytes. To reach an integer multiple of 4, the structure actually occupies 28 bytes, that is, [0, 27]. The size of the structure can be verified as follows:
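A sketch of one way to verify this with sizeof. The structure below is an assumption reconstructed from the lengths and locations given in the text, and printDataSize and MEMBER_BYTES are hypothetical names. #pragma pack (a GCC, Clang and MSVC extension) stands in for the 4-byte alignment factor; with a 2-byte short, 4-byte int and 8-byte double this should report a size of 28, of which 19 bytes are members.

```c
#include <stdio.h>

#pragma pack(push, 4)          /* emulate an alignment factor of 4 */
struct Data {
    char a; short b; char c[2]; double d; char e; int f; char g;
};
#pragma pack(pop)

/* Bytes used by the members themselves, without any padding. */
enum {
    MEMBER_BYTES = sizeof(char) + sizeof(short) + 2 * sizeof(char)
                 + sizeof(double) + sizeof(char) + sizeof(int) + sizeof(char)
};

/* Report the total size of the structure and how much of it is padding. */
void printDataSize(void)
{
    printf("sizeof = %zu, members = %d, padding = %zu\n",
           sizeof(struct Data), (int)MEMBER_BYTES,
           sizeof(struct Data) - MEMBER_BYTES);
}
```

Without the pragma, a compiler that aligns double on 8 bytes would round the size up to 32 instead.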

Although the members total only 19 bytes, the structure actually occupies 28 bytes. The extra 9 bytes are padding reserved for memory alignment, namely addresses 1, 6, 7, 17, 18, 19, 25, 26 and 27, which form 4 segments: [1], [6, 7], [17, 19] and [25, 27]. Table 2.1 shows that each of these wasted segments immediately follows char data. Because a char occupies only one byte, the space right after it often cannot be used by longer data.

To reduce memory waste, the members placed right after char data should be the shortest ones available. In other words, when defining a structure, define its members in order of increasing length. The example structure can be redefined as follows:
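A sketch of the reordered definition, under the same assumptions as before: the member names and types are reconstructed from the text, the tag DataSorted is hypothetical, and #pragma pack (a compiler extension) emulates the 4-byte alignment factor. The exact order of the one-byte members is also an assumption; any order that puts all five char bytes first gives the same layout.

```c
#pragma pack(push, 4)   /* emulate an alignment factor of 4 */
struct DataSorted {
    char   a;      /* [0]                              */
    char   c[2];   /* [1], [2]                         */
    char   e;      /* [3]                              */
    char   g;      /* [4]                              */
    short  b;      /* [6, 7]; address 5 is padding     */
    int    f;      /* [8, 11]                          */
    double d;      /* [12, 19]                         */
};
#pragma pack(pop)
```

The locations in the comments assume a 2-byte short, 4-byte int and 8-byte double.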

Similarly, each member is stored in turn, as shown in Table 2.2.

Table 2.2 Storing each member in turn

The members occupy [0, 19], and only the memory at address 5 in the middle is discarded. The size of the structure is 20 bytes, which is already an integer multiple of 4, so no further padding is required. The structure wastes only 1 byte, and its utilization reaches 95%. Clearly, optimizing the order in which the members are defined greatly reduces memory waste while still meeting the memory alignment requirements.

> > > 2.2.2 Basic data types

1. Range value check

The rangeCheck() range value check function takes three int parameters: value, min and max. If min ≤ value ≤ max, value is legal and the function returns true; otherwise it returns false. See Listing 2.10 for details.

Listing 2.10 Implementation of the rangeCheck() range value check function (1)
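A minimal sketch of such a function, following the description above. The order of the parameters (min, max, value) is an assumption.

```c
#include <stdbool.h>

/* Return true if min <= value <= max, false otherwise. */
bool rangeCheck(int min, int max, int value)
{
    if (value < min || value > max) {
        return false;
    }
    return true;
}
```

Both bounds are inclusive, so with a range of 0 to 9 the values 0 and 9 are accepted and 10 is rejected.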

Code cleanliness

rangeCheck is a very descriptive name because it describes well what the function does; the value of a good name cannot be overestimated. When every routine turns out to do pretty much what its name led you to expect, you are looking at clean code. The shorter and more focused a function is, the easier it is to give it a good name. Long names are nothing to fear: a long, descriptive name is better than a short, puzzling one. Choosing descriptive names helps the programmer clarify the design of the module, and the search for a good name often leads to a beneficial restructuring of the code.

From the point of view of code cleanliness, the ideal number of function parameters is zero, followed by one, and then two. Three-parameter functions should be avoided wherever possible. Using more than three parameters requires a very good reason; without one, do not do it, because every parameter adds to the conceptual burden on the reader.

From a testing point of view, parameters are even more troublesome: writing test cases that ensure every combination of parameter values works properly is a daunting task. Output parameters are harder to understand than input parameters, because people habitually expect information to enter a function through its parameters and to leave it through the return value. If a function seems to need two, three or more parameters, some of them should probably be wrapped in a structure. For example:
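One common illustration of this idea, sketched in C. The types Point and Circle and both function names are hypothetical and serve only to show two related parameters being wrapped in a structure.

```c
typedef struct { double x; double y; } Point;
typedef struct { Point center; double radius; } Circle;

/* Before: three loose parameters; the caller must remember their order. */
Circle makeCircleFromXYR(double x, double y, double radius)
{
    Circle c;
    c.center.x = x;
    c.center.y = y;
    c.radius = radius;
    return c;
}

/* After: x and y travel together as one Point, leaving two parameters. */
Circle makeCircle(Point center, double radius)
{
    Circle c;
    c.center = center;
    c.radius = radius;
    return c;
}
```

The second form also gives the coordinate pair a name, which is often the larger benefit.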

Beyond that, the best way to reduce the number of function parameters is to make each function do only one thing. A function should either do something or answer something: it should either modify the state of an object or return information about that object. Doing both often causes confusion.

2. Types and variables

Thanks to the structure, the formal parameters min and max of rangeCheck() can be moved into a structure, which not only removes one parameter but also makes the pair easier to handle. For example:
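A sketch of such a declaration, using the tag _Range and the variable name range that the text refers to.

```c
/* Describes the layout struct _Range and, in the same statement,
   creates the variable range.
   Note: identifiers that begin with an underscore followed by an
   uppercase letter are reserved in C, so production code may prefer
   a different tag name. */
struct _Range {
    int min;   /* lower bound */
    int max;   /* upper bound */
} range;
```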

This declaration describes a structure consisting of two int variables. It not only creates the actual data object range but also describes what that object is made of, because it outlines how the structure stores its data. Clearly, range is a structure variable of type struct _Range. Now add typedef before the struct definition:
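A sketch of the typedef form, assuming the type name Range that the following text uses.

```c
/* With typedef, the name after the closing brace is a type name,
   not a variable: Range is now a synonym for struct _Range. */
typedef struct _Range {
    int min;
    int max;
} Range;
```

Either spelling can then be used in declarations, and objects declared with one can be assigned to objects declared with the other.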

At this point Range is a type name, that is, Range is equivalent to struct _Range. By convention, the first character of a type name is capitalized and the first character of a variable name is lowercase. With the Range type, you can define both a variable range of type Range and a pointer variable pRange of type Range *. You can also omit the tag _Range. For example:
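A sketch of the tagless form, together with the two kinds of variable just mentioned.

```c
/* The tag is omitted; the type can only be referred to as Range. */
typedef struct {
    int min;
    int max;
} Range;

Range  range;            /* a variable of type Range */
Range *pRange = &range;  /* a pointer of type Range * pointing to range */
```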

Note that "structure" has two meanings. One is the "structure layout", which tells the compiler how to represent the data but does not make the compiler allocate space for it. The other is the structure variable; creating one is the next step, and it is defined as follows:
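The two meanings can be seen side by side in a short sketch, assuming the Range type introduced earlier.

```c
/* Structure layout: describes the data, reserves no storage. */
typedef struct {
    int min;
    int max;
} Range;

/* Structure variable: the compiler now reserves sizeof(Range) bytes
   and gives them the name range. */
Range range;
```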

When the compiler processes this line, it creates the structure variable range, using Range as the template for allocating space: an int variable min and an int variable max, combined under the single name range.

3. Initialization

Assuming the valid range of value is 0~9, you can use a macro named newRangeCheck to initialize the structure conveniently. For example:

The method of use is as follows:

The macro expands as follows:

It is equivalent to:
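The four steps above (macro definition, use, expansion and equivalent form) can be sketched together as follows. The body of the newRangeCheck macro is an assumption, and makeDigitRange is a hypothetical helper that only shows the member-by-member form.

```c
typedef struct {
    int min;
    int max;
} Range;

/* Macro definition (assumed body): expands to a brace-enclosed initializer. */
#define newRangeCheck(min, max) { (min), (max) }

/* Use: */
Range range = newRangeCheck(0, 9);

/* After preprocessing, the line above reads:
       Range range = { (0), (9) };                          */

/* Equivalent member-by-member form: */
Range makeDigitRange(void)
{
    Range r;
    r.min = 0;
    r.max = 9;
    return r;
}
```

Because range has static storage duration here, the macro arguments must be constant expressions.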

In essence, .min and .max play the same role for the Range structure that a subscript plays for an array. Although range is a structure, range.min and range.max are both variables of type int, so they can be used like any other int variable, for example in &(range.min).

Note that if you initialize a structure with static storage duration, the values in the initialization list must be constant expressions. For a structure with automatic storage duration, the values in the initialization list need not be constants.

4. Interface and implementation

(1) Passing structure members

As long as a structure member is a data type with a single value, such as an int, char, float, double or pointer, it can be passed as an argument to a function that accepts that type. The implementation of rangeCheck() is detailed in Listing 2.11.

Listing 2.11 Implementation of the rangeCheck() function (2)

The form of its call is as follows:
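A sketch of the function and of a call that passes the members one by one. The parameter order and the wrapper checkValue are assumptions made for illustration.

```c
#include <stdbool.h>

typedef struct {
    int min;
    int max;
} Range;

/* Receives plain int values; it has no idea they came from a structure. */
bool rangeCheck(int min, int max, int value)
{
    return min <= value && value <= max;
}

Range range = {0, 9};

/* The call: each member is passed separately, by value. */
bool checkValue(int value)
{
    return rangeCheck(range.min, range.max, value);
}
```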

rangeCheck() neither knows nor cares whether its arguments are members of a structure; it only requires that the incoming data be of type int. If the called function needs to modify the value of a member that belongs to the calling function, the member's address must be passed instead.

(2) Passing a structure

Although a structure is more complex to pass than a single value, standard C also allows a structure to be used as a parameter. The implementation of the rangeCheck() function is detailed in Listing 2.12.

Listing 2.12 Implementation of the rangeCheck() function (3)

The form of its call is as follows:
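A sketch of the by-value version and its call. The signature is an assumption based on the description; checkValue is a hypothetical wrapper.

```c
#include <stdbool.h>

typedef struct {
    int min;
    int max;
} Range;

/* The whole structure is copied into the parameter range. */
bool rangeCheck(Range range, int value)
{
    return range.min <= value && value <= range.max;
}

/* The call: the structure variable itself is the argument. */
bool checkValue(int value)
{
    Range range = {0, 9};
    return rangeCheck(range, value);
}
```

Changes made to the parameter inside rangeCheck() would not affect the caller's copy.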

Although this method gives the correct result, it is inefficient: C passes arguments by value, so a copy of the argument must be handed to the function. Suppose a member of the structure is an array of 128 bytes, or even larger. Passing the structure as an argument means copying all of those bytes onto the stack and discarding them afterwards.

(3) Passing the address of a structure

Suppose a set of data is stored in an array that is a member of a structure. Its data structure is as follows:

Obviously, the maximum value of the elements in the array can be found by passing the address of the structure, (int *)&st, as the argument for the formal parameter of iMax(). See Listing 2.13 for details.

Listing 2.13 Example of finding the maximum value of an element in an array
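A sketch of how this might look. The structure type Samples, the constant NUM_DATA, the sample values, the signature of iMax() and the wrapper maxOfSamples are all assumptions; only the names st and iMax and the cast (int *)&st come from the text.

```c
#include <stddef.h>

#define NUM_DATA 5                      /* hypothetical array length */

typedef struct {
    int data[NUM_DATA];                 /* the array is the first member */
} Samples;

Samples st = { {39, 33, 18, 64, 73} };  /* hypothetical sample values */

/* Return the largest of numData ints starting at pData (numData >= 1). */
int iMax(const int *pData, size_t numData)
{
    int max = pData[0];
    for (size_t i = 1; i < numData; i++) {
        if (pData[i] > max) {
            max = pData[i];
        }
    }
    return max;
}

/* The address of the structure is also the address of its first member. */
int maxOfSamples(void)
{
    return iMax((int *)&st, NUM_DATA);
}
```

The cast works only because the array is the first member of the structure; passing st.data would express the same thing without a cast.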

Taking the range value validator as an example, a pointer variable pRange that points to the structure is defined, initialized and assigned in the same way as an ordinary pointer variable:

Unlike an array name, a structure name is not the address of the structure, so the & operator must be placed before the structure name. Here pRange is therefore a pointer variable pointing to the variable range of type Range. Although &range.min has a different type (int *) from pRange and &range (Range *), their values are equal, so the following relationship holds:
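A sketch that defines the pointer and checks the relationships just described. The initial values and the helper relationsHold are hypothetical.

```c
typedef struct {
    int min;
    int max;
} Range;

Range  range  = {0, 9};
Range *pRange = &range;   /* & is required: range is not an address */

/* Return 1 if the address and value relationships hold, 0 otherwise. */
int relationsHold(void)
{
    return (void *)pRange == (void *)&range       /* same address ...      */
        && (void *)&range == (void *)&range.min   /* ... as first member   */
        && (*pRange).min == pRange->min           /* two spellings         */
        && pRange->min == range.min;              /* of the same member    */
}
```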

Since the . operator has higher precedence than the * operator, the parentheses in (*pRange).min are required. The key point is that pRange is a pointer: pRange->min denotes the min member of the structure that pRange points to, so pRange->min is a variable of type int. The implementation of the rangeCheck() function is shown in Listing 2.14.

Listing 2.14 Implementation of the rangeCheck() function (4)

rangeCheck() takes the pointer pRange to a Range as its parameter. Passing the address &range to the function makes pRange point to range, and the -> operator then retrieves the values of range.min and range.max. Note that the & operator is required to obtain the address of the structure: unlike an array name, a structure name is not an alias for its address.

The form of its call is as follows:
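A sketch of the pointer version and its call. The signature follows the description of Listing 2.14; checkValue is a hypothetical wrapper.

```c
#include <stdbool.h>

typedef struct {
    int min;
    int max;
} Range;

/* Only the address of the structure is passed; members are read with ->. */
bool rangeCheck(Range *pRange, int value)
{
    return pRange->min <= value && value <= pRange->max;
}

Range range = {0, 9};

/* The call: &range is required to obtain the address of the structure. */
bool checkValue(int value)
{
    return rangeCheck(&range, value);
}
```

Since the function does not modify the structure, the parameter could also be declared as const Range *.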

(4) Calling through a function pointer

If you need to add a parity validator that checks whether value is odd or even, its data structure is as follows:

The implementation of the oddEvenCheck() function is detailed in Listing 2.15.

Listing 2.15 Implementation of the oddEvenCheck() function
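A sketch of the data structure and of oddEvenCheck(). The member name isEven and the exact signature are assumptions; only the names OddEven, pOddEven and oddEvenCheck appear in the text.

```c
#include <stdbool.h>

/* Parameter of the parity validator: which parity is considered legal. */
typedef struct {
    bool isEven;   /* true: value must be even; false: value must be odd */
} OddEven;

/* Return true if the parity of value matches the one requested. */
bool oddEvenCheck(OddEven *pOddEven, int value)
{
    bool valueIsEven = (value % 2 == 0);
    return pOddEven->isEven == valueIsEven;
}
```

Testing the remainder against 0 rather than 1 keeps the check correct for negative odd values, whose remainder is -1.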

When the system requires several validators, the caller decides at run time which function to call. According to the dependency inversion principle, the best approach is to use function pointers to isolate the changes. Whatever the validator, the common part is judging whether value is legal, so that part is abstracted into a module, while the value and the check parameters are supplied from outside as arguments. Since the validators take parameters of different types, "void *pData" must be used as the formal parameter so that data of any type can be accepted; that is, Range *pRange and OddEven *pOddEven are generalized to void *pData. The Validate type is defined as follows:

Here pData is a pointer to the parameters of any validator, and value is the value to be validated. The interface of the general validator is shown in Listing 2.16.

Listing 2.16 General Validator Interface (validator.h)
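A sketch of what such a header might contain. The Validate type follows the description above (a void *pData parameter and an int value); the include guard, the member names and the function names validateRange and validateOddEven are assumptions.

```c
#ifndef VALIDATOR_H
#define VALIDATOR_H

#include <stdbool.h>

/* Common signature shared by every validator. */
typedef bool (*Validate)(void *pData, int value);

/* Parameters of the range value validator. */
typedef struct {
    int min;
    int max;
} Range;

/* Parameters of the parity validator. */
typedef struct {
    bool isEven;
} OddEven;

/* Concrete validators; both are compatible with the Validate type. */
bool validateRange(void *pData, int value);
bool validateOddEven(void *pData, int value);

#endif /* VALIDATOR_H */
```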

Taking the range value validator as an example, the call takes the following form:

What is passed to the function this time is a pointer to the structure. A pointer is much smaller than the whole structure, so pushing it onto the stack is much more efficient. The implementation of the validator interface is shown in Listing 2.17.

Listing 2.17 Implementation of the validator interface (validator.c)
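A sketch of one possible implementation. The declarations are repeated here only to keep the example self-contained; in a real project they would come from validator.h. The function names validateRange, validateOddEven and check, and the member name isEven, are assumptions.

```c
#include <stdbool.h>

/* Normally provided by validator.h. */
typedef bool (*Validate)(void *pData, int value);
typedef struct { int min; int max; } Range;
typedef struct { bool isEven; } OddEven;

/* Range validator: pData must point to a Range. */
bool validateRange(void *pData, int value)
{
    Range *pRange = (Range *)pData;     /* cast back to the real type */
    return pRange->min <= value && value <= pRange->max;
}

/* Parity validator: pData must point to an OddEven. */
bool validateOddEven(void *pData, int value)
{
    OddEven *pOddEven = (OddEven *)pData;
    return pOddEven->isEven == (value % 2 == 0);
}

/* Hypothetical caller: it knows only the Validate signature, so new
   validators can be added without changing this function. */
bool check(Validate validate, void *pData, int value)
{
    return validate(pData, value);
}
```

A call for the range validator could then look like check(validateRange, &range, 8), with range defined as a Range holding 0 and 9. The compiler cannot verify that pData points to the type a validator expects, so pairing each function with the right structure is the caller's responsibility.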

Since pRange, pOddEven and pData have different types, pData must be cast to the appropriate type before the members of the corresponding structure can be referenced. Note that the author does not provide the complete code here; readers are encouraged to complete it themselves.
