Output Formatting
We have seen the basic ability to convert values to strings and output them to the console ([[stdout]]). Sometimes, however, we would like more control over how values are formatted on output. Fortunately, formatted output functions give us that control.
C-Style printf()
When the C language was developed in the 1970s, it introduced a new way of formatting output: the printf() function. This function takes a format string followed by values that are substituted into the format string according to the type specifiers found in it. This approach has proven very popular and has been copied by many languages since.
With formatted output we must provide a format string that can contain literal text interspersed with type specifiers. The type specifiers are placeholders that identify where and how a value is to be formatted and inserted into the string on output. Type specifiers start with a percent sign (%), followed by optional flags, followed by an optional field width and/or precision, followed by one or two characters identifying the type of the value (called the type specifier). Only the percent sign and the type characters are required; the other components are optional.
In addition to the format string, we also provide the values to be formatted to the function. We can have multiple type specifiers in the format string as long as we provide the matching number of additional arguments. The first argument is formatted by the first type specifier, the second argument by the second type specifier, and so on.
Since C++ inherits the C standard library, we can use printf() in our C++ programs by including the <cstdio> header.
Any other characters in the format string that are not part of a type specifier are printed verbatim. For example:
    printf("Page count is %d\n", numPages);
    printf("PI: %.2f\n", pi);
    printf("Title and Page Count: %s %d\n", title, numPages);
    printf("%d of %d\n", currentPage, numPages);
Some of the type specifiers available are listed in the table below.
Specifier | Value Type | Description |
---|---|---|
d | signed int | Signed decimal |
u | unsigned int | Unsigned decimal |
x | unsigned int | Hexadecimal (lowercase) |
X | unsigned int | Hexadecimal (uppercase) |
ld | signed long | Signed decimal |
lu | unsigned long | Unsigned decimal |
lx | unsigned long | Hexadecimal (lowercase) |
lX | unsigned long | Hexadecimal (uppercase) |
f | double | Fixed notation |
e | double | Scientific notation (lowercase) |
E | double | Scientific notation (uppercase) |
g | double | Shortest representation: %e or %f |
G | double | Shortest representation: %E or %F |
c | int | Single character |
lc | wchar_t | Single wide character |
s | char* | C-String of characters |
ls | wchar_t* | C-String of wide characters |
p | void* | Pointer address in hexadecimal |
% | none | Outputs literal % |
The type specifiers d, u, x, or X can be modified to print a short int (typically 16-bit) by prefixing the specifier with an 'h', as in 'hd' or 'hx'. They can also be modified to print a long int (64-bit on most Linux and macOS systems, 32-bit on Windows) by prefixing the specifier with an 'l', as in 'ld' or 'lx'.
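As a minimal sketch of the length modifiers, the helper below (its name is made up for illustration) formats a short with %hd and a long with %ld. It writes into a local buffer with std::snprintf so that the result can be returned as a std::string instead of being printed.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a short and a long with the h and l modifiers.
std::string format_short_and_long(short small, long large) {
    char buffer[64];  // comfortably larger than the longest possible result
    // %hd matches a short argument; %ld matches a long argument.
    std::snprintf(buffer, sizeof(buffer), "short: %hd, long: %ld", small, large);
    return std::string(buffer);
}
```

Calling format_short_and_long(-7, 3261963L) produces the text "short: -7, long: 3261963".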
Flag | Description |
---|---|
- | Left-justify within the given field width; Right justification is the default. |
+ | Result is preceded by a plus or minus sign (+ or -) even for positive numbers. By default, only negative numbers are preceded by a - sign. |
# | Used with the o, x or X specifiers, the value is preceded by 0, 0x or 0X respectively. Used with e, E, f, g or G, it forces the written output to contain a decimal point even if no more digits follow. By default, if no digits follow, no decimal point is written. |
0 | Left-pads the number with zeroes (0) instead of spaces |
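The flags in the table can be tried out with a small sketch like the one below. The apply_format name is an assumption made for this example, and the caller is responsible for passing a value whose type matches the specifier in the format string.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: applies one printf-style format to one value and
// returns the result. The specifier in `format` must match the type T.
template <typename T>
std::string apply_format(const char* format, T value) {
    char buffer[64];  // snprintf never writes more than sizeof(buffer) bytes
    std::snprintf(buffer, sizeof(buffer), format, value);
    return std::string(buffer);
}

// apply_format("[%6d]", 42)    -> "[    42]"   right-justified (default)
// apply_format("[%-6d]", 42)   -> "[42    ]"   '-' left-justifies
// apply_format("%+d", 42)      -> "+42"        '+' forces a sign
// apply_format("%#x", 255u)    -> "0xff"       '#' adds the 0x prefix
// apply_format("%06d", -42)    -> "-00042"     '0' pads with zeroes
```

The square brackets are literal text in the format string; they only make the padding visible.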
The field width and precision are simply integer values separated by a period (.). One may use either in the type specifier, both or none. Field width sets the number of characters that will be used to output the value, padded with spaces if necessary to achieve the field width. The precision specifier is only useful for formatting floating point and string values. For floating point values it limits the number of decimal places after the decimal point. For strings it truncates the string to the specified number of characters.
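To make the interaction of field width and precision concrete, here is a small sketch with two helpers whose names and column widths are chosen purely for illustration: one formats a double in an 8-character field with 2 decimal places, the other left-justifies the first 3 characters of a string in a 10-character field.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: width 8, precision 2 for a floating point value.
std::string price_column(double value) {
    char buffer[64];  // very large values are truncated safely by snprintf
    std::snprintf(buffer, sizeof(buffer), "[%8.2f]", value);
    return std::string(buffer);
}

// Hypothetical helper: width 10, precision 3 for a string. The precision
// truncates the text to 3 characters; the '-' flag left-justifies it.
std::string name_column(const std::string& name) {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "[%-10.3s]", name.c_str());
    return std::string(buffer);
}
```

With these definitions, price_column(3.14159) gives "[    3.14]" and name_column("Bananas") gives "[Ban       ]".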
    #include <cstdio>
    #include <iostream>
    #include <string>

    using namespace std;

    int main(int argc, char **argv) {
        int const a = 326;
        int const b = -1;
        int const c = 2015;
        long const i1 = 65000;
        long const i2 = -2;
        long const i3 = 3261963;
        double const f1 = 3.1415926;
        double const f2 = 2.99792458e9;
        double const f3 = 1.234e-4;
        int const c1 = int('A');
        int const c2 = int('B');
        int const c3 = int('C');
        string const s1 = "Apples";
        string const s2 = "and";
        string const s3 = "Bananas";
        bool const b1 = true;
        bool const b2 = false;

        // Integers: negative values passed to %u and %x show up as large
        // unsigned values.
        printf("Decimals: %d %d %d\n", a, b, c);
        printf("Unsigned Decimals: %u %u %u\n", a, b, c);
        printf("Hexadecimals: %#x %#x %#x\n", a, b, c);
        printf("Long Decimals: %ld %ld %ld\n", i1, i2, i3);
        printf("Long Hexadecimals: %016lx %016lx %016lx\n", i1, i2, i3);

        // Floating point in fixed, scientific and general notation
        printf("Fixed FP: %f %f %f\n", f1, f2, f3);
        printf("Exponential FP: %e %e %e\n", f1, f2, f3);
        printf("General FP: %g %g %g\n", f1, f2, f3);
        printf("General FP with precision: %.2g %.2g %.2g\n", f1, f2, f3);

        // Booleans are promoted to int; characters and strings
        printf("Boolean: %d %d\n", b1, b2);
        printf("Character: %c %c %c\n", c1, c2, c3);
        printf("String: %s %s %s\n", s1.c_str(), s2.c_str(), s3.c_str());
        return 0;
    }
Output
Because printf() is a C function, not a C++ function, we can't pass C++ strings directly; we must convert them to C strings using the c_str() method of C++ strings.
C-Style sprintf()
Along with the printf() function, the C language introduced a similar function called sprintf(). Instead of printing a formatted string to the console ([[stdout]]), it writes the formatted text into a string. This more general form is useful not only for console output but also for [[GUI]] output and file output.
The sprintf() function in C++ is a hold-over from the C language. As such it doesn't return a std::string but instead requires a buffer to be passed in as the first argument. The buffer is an array of characters large enough to hold any possible formatted output. If the formatted output exceeds the size of this buffer, sprintf() writes past the end of the array, which is undefined behavior and a classic source of crashes and security holes. A safer version, snprintf(), takes the buffer size as its second argument and limits the output so that it doesn't exceed the buffer. We will cover C-style strings and arrays later.
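The following sketch illustrates the buffer handling described above using std::snprintf. The function names and the deliberately small 16-byte buffer are assumptions made for this example. Note that snprintf returns the number of characters the full output would have needed, which lets us detect truncation.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats into a fixed-size buffer. snprintf writes at
// most sizeof(buffer) bytes, including the terminating '\0'.
std::string format_total(double amount) {
    char buffer[16];
    std::snprintf(buffer, sizeof(buffer), "Total: %.2f", amount);
    return std::string(buffer);
}

// Hypothetical helper: reports whether the text did not fit in the buffer.
bool was_truncated(double amount) {
    char buffer[16];
    int needed = std::snprintf(buffer, sizeof(buffer), "Total: %.2f", amount);
    // `needed` excludes the '\0', so it must be strictly less than the size.
    return needed < 0 || static_cast<std::size_t>(needed) >= sizeof(buffer);
}
```

Here format_total(12.5) yields "Total: 12.50", while format_total(123456789.0) is cut off after 15 characters ("Total: 12345678") instead of overflowing the buffer.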
    #include "Utils.hpp"
    #include <iostream>
    #include <string>

    using namespace std;

    int main(int argc, char **argv) {
        int const a = 326;
        int const b = -1;
        int const c = 2015;
        long const i1 = 65000;
        long const i2 = -2;
        long const i3 = 3261963;
        double const f1 = 3.1415926;
        double const f2 = 2.99792458e9;
        double const f3 = 1.234e-4;
        int const c1 = int('A');
        int const c2 = int('B');
        int const c3 = int('C');
        string const s1 = "Apples";
        string const s2 = "and";
        string const s3 = "Bananas";
        bool const b1 = true;
        bool const b2 = false;

        string s;
        s = Utils::sprintf("Decimals: %d %d %d", a, b, c);
        cout << s << endl;
        s = Utils::sprintf("Unsigned Decimals: %u %u %u", a, b, c);
        cout << s << endl;
        s = Utils::sprintf("Hexadecimals: %#x %#x %#x", a, b, c);
        cout << s << endl;
        s = Utils::sprintf("Long Decimals: %ld %ld %ld", i1, i2, i3);
        cout << s << endl;
        s = Utils::sprintf("Long Hexadecimals: %016lx %016lx %016lx", i1, i2, i3);
        cout << s << endl;
        s = Utils::sprintf("Fixed FP: %f %f %f", f1, f2, f3);
        cout << s << endl;
        s = Utils::sprintf("Exponential FP: %e %e %e", f1, f2, f3);
        cout << s << endl;
        s = Utils::sprintf("General FP: %g %g %g", f1, f2, f3);
        cout << s << endl;
        s = Utils::sprintf("General FP with precision: %.2g %.2g %.2g", f1, f2, f3);
        cout << s << endl;
        s = Utils::sprintf("Boolean: %d %d", b1, b2);
        cout << s << endl;
        s = Utils::sprintf("Character: %c %c %c", c1, c2, c3);
        cout << s << endl;
        s = Utils::sprintf("String: %s %s %s", s1.c_str(), s2.c_str(), s3.c_str());
        cout << s << endl;
        return 0;
    }
Output
Because sprintf() is a C function, we can't pass C++ strings directly; we must convert them to C strings using the c_str() method of strings.
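A helper that accepts a printf-style format and returns a std::string can be built on top of vsnprintf. The sketch below is one possible way to do it and is not the source of the Utils::sprintf() function used above, which is not shown here; the name format_string is hypothetical. It calls vsnprintf twice: once to measure the output and once to write it into a buffer of exactly the right size.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting that returns a std::string.
std::string format_string(const char* format, ...) {
    va_list args;
    va_start(args, format);
    va_list args_copy;
    va_copy(args_copy, args);  // the argument list can only be walked once

    // First pass: a null buffer of size 0 just reports the length needed.
    int needed = std::vsnprintf(nullptr, 0, format, args);
    va_end(args);
    if (needed < 0) {  // encoding or format error
        va_end(args_copy);
        return std::string();
    }

    // Second pass: write into a buffer with room for the terminating '\0'.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), format, args_copy);
    va_end(args_copy);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

For example, format_string("%d of %d", 3, 10) returns "3 of 10". As with printf(), C++ strings still have to be passed through c_str().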
String Interpolation
Many languages can substitute the values of variables written directly into strings. This variable substitution in strings is known as variable interpolation. C++ does not support any form of variable interpolation in its string literals.
Modern Message Formatting
The methods for formatting strings and output discussed so far have some limitations when it comes to localizing software. The positional approach taken by the printf() style functions poses difficulties for localization because translation often changes the order of words, and thus the order of the substitution specifiers, while the hard-coded argument list in our code cannot change to match. Variable interpolation has its own drawback: you are actually putting code into the strings. When externalizing the strings for localization (removing them from the code and putting them in a separate file), it is not desirable to export variables and expressions from our code to the localization file, where they can be changed.
This is why message formatting approaches have been developed that use substitution specifiers that specify which argument to the message format function is to be used. This allows the substitution specifiers to be in a different order (and perhaps re-ordered during localization) than the formal arguments to the message format function. These substitution specifiers are also not executable code as is the case with variable interpolation so it is much safer to externalize from our program code as we will see in the section on Internationalization.
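The idea of numbered substitution specifiers can be illustrated with a deliberately tiny, hand-rolled sketch. This is not how std::format or the fmt library is implemented; the substitute function is hypothetical and supports only plain {N} placeholders with string arguments. It does show why a translator can reorder the placeholders without the argument list in the code changing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: replaces each "{N}" in `pattern` with args[N].
// Anything that is not a valid placeholder is copied through unchanged.
std::string substitute(const std::string& pattern,
                       const std::vector<std::string>& args) {
    std::string result;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == '{') {
            std::size_t close = pattern.find('}', i);
            if (close != std::string::npos) {
                std::string inside = pattern.substr(i + 1, close - i - 1);
                // Accept only short, all-digit indexes such as "0" or "12".
                bool is_index = !inside.empty() && inside.size() <= 3 &&
                    inside.find_first_not_of("0123456789") == std::string::npos;
                if (is_index) {
                    std::size_t index = std::stoul(inside);
                    if (index < args.size()) {
                        result += args[index];
                        i = close + 1;
                        continue;
                    }
                }
            }
        }
        result += pattern[i];
        ++i;
    }
    return result;
}
```

With the arguments {"3", "10"}, the pattern "Page {0} of {1}" produces "Page 3 of 10", while a reordered pattern such as "{1} pages in total, now on page {0}" produces "10 pages in total, now on page 3" from the same argument list.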
Modern message formatting based on the Python format() function was approved for inclusion in the C++20 standard, but compiler support arrived gradually, so older toolchains may not provide it. Where it is available, all you need to do is include <format> to be able to use the std::format() function.
There is already an implementation available at the [[Github C++ Modern Formatting Library]]. Start by downloading the library in ZIP format by clicking on Releases on the left side of the page. Unzip the downloaded "fmt-10.1.1.zip" file to create the directory of source files named "fmt-10.1.1". Then in a console window execute these commands:
On Linux or Raspberry Pi if "cmake" doesn't run simply type: $ sudo apt install cmake to install it from the command line.
On MacOS X if "cmake" doesn't run you will need to install a Unix package manager such as [[MacPorts]] or [[Homebrew]]. For example, install MacPorts, then in a new terminal window type: $ sudo port install cmake to install it from the command line.
Then open the file "fmt-master/test/scan-test.cc" with a text editor. Find the <time.h> and <climits> includes at the beginning of the file. Then add one line before and after those includes. With this modification you should be able to run "make". The "make test" command will have one failure that we will have to ignore for now. On Cygwin you don't need to use the "sudo" command to install so just type "make install" as the final command.
Once the format library is installed you will need to add an additional argument to the C++ compiler command line to link this library into your programs. Simply add -lfmt as illustrated below...
To use the library in your source code you will need to include <fmt/format.h> at the top of your source files. Then include the namespace fmt.
The format() function uses a pair of curly brackets to identify substitution placeholders in the format string. Each pair of curly brackets contains a number from 0 to the number of additional arguments - 1. This number refers to the position in the argument list of the value that will be used in the substitution. This allows values in the argument list to be used in any order needed in the format string or even used more than once.
Following the position number in the substitution placeholder is an optional type specifier. The type specifier is separated from the position number by a colon. If you omit the type specifier, the default presentation for the argument's type is used (for example, decimal for integers and plain text for strings). Some of the type specifiers available are listed in the table below.
Specifier | Value Type | Description |
---|---|---|
b | integer | binary |
d | integer | decimal |
o | integer | octal |
x | integer | hexadecimal (lowercase) |
X | integer | hexadecimal (uppercase) |
f | double | fixed notation |
e | double | scientific notation (lowercase) |
E | double | scientific notation (uppercase) |
g | double | shortest representation: e or f |
G | double | shortest representation: E or F |
c | integer | single character |
s | string | string of characters |
As we have seen with printf(), the format() specifiers can take optional modifiers that change how the value is formatted. The full definition of a substitution placeholder, including the optional components described below, is as follows:
{position:[[fill]align][sign][#][0][width][.precision][type]}
The fill-and-align option is an optional fill character (which can be any character other than { or }), followed by one of the align options <, > or ^. If no fill character is specified, then the space character is used.
Option | Description |
---|---|
< | Left-justify within the given field width. This is the default for non-numeric types. |
> | Right-justify within the given field width. This is the default for numeric types. |
^ | Aligns the value in the center of the field. |
The sign option is one of +, - or the space character.
Option | Description |
---|---|
+ | Signifies that a sign character should be output for all values (+ for positive and - for negative). |
- | Signifies that a sign character should be output for negative values only. This is the default for numeric types. |
<space> | Signifies that a leading space should be used for positive values and a - should be output for negative values. |
The # option causes alternate formatting to be used.
- For integral types, when binary, octal, or hexadecimal presentation type is used, the alternate form inserts the prefix (0b, 0, or 0x) into the output value after the sign character.
- For floating-point types, the alternate form causes the result of the format to always contain a decimal-point character, even if no digits follow it.
The 0 option pads the field with leading zeros (following any indication of sign or base) to the field width. If the 0 option and an align option are both used, the 0 option is ignored.
The width option is a positive decimal number. If present, it specifies the minimum field width. If the formatted value cannot fit within the field width, the entire value is inserted anyway, making the field wider than the requested width.
The precision option is a . followed by a non-negative decimal number. This option indicates the precision to use when formatting a value. For floating-point types, precision specifies the formatting precision, i.e. the number of places after the decimal point to display. For string types, it specifies how many characters from the string will be used.
    #include <clocale>
    #include <codecvt>
    #include <fmt/format.h>
    #include <fmt/xchar.h>
    #include <iostream>
    #include <string>

    std::locale utf8loc(std::locale(), new std::codecvt_utf8<wchar_t>);

    using namespace std;

    int main(int argc, char **argv) {
        setlocale(LC_ALL, "en_US.UTF-8");
        wcout.imbue(utf8loc);
        wcin.imbue(utf8loc);

        wstring const first = L"First";
        wstring const middle = L"Middle";
        wstring const lastname = L"Last";
        wstring const left = L"Left";
        wstring const center = L"Center";
        wstring const right = L"Right";
        wstring const favorite = L"kerfuffle";
        long const i1 = 3261963;
        short const i2 = -42;
        double const fp1 = 3.1415926;
        double const fp2 = 2.99792458e9;
        double const fp3 = -1.234e-4;

        wstring s = fmt::format(L"{2}, {0} {1}", first, middle, lastname);
        wcout << s << endl;
        s = fmt::format(L"{0} {1} {2}", left, center, right);
        wcout << s << endl;
        s = fmt::format(L"Favorite number is {0}", i1);
        wcout << s << endl;
        s = fmt::format(L"Favorite FP is {0}", fp1);
        wcout << s << endl;

        wchar_t c = favorite[0];
        wcout << fmt::format(L"Favorite c is {0:c}", c) << endl;
        wcout << fmt::format(L"Favorite 11c is |{0:11c}|", c) << endl;
        wcout << fmt::format(L"Favorite <11c is |{0:<11c}|", c) << endl;
        wcout << fmt::format(L"Favorite ^11c is |{0:^11c}|", c) << endl;
        wcout << fmt::format(L"Favorite >11c is |{0:>11c}|", c) << endl;
        wcout << fmt::format(L"Favorite .<11c is |{0:.<11c}|", c) << endl;
        wcout << fmt::format(L"Favorite _^11c is |{0:_^11c}|", c) << endl;
        wcout << fmt::format(L"Favorite >11c is |{0: >11c}|", c) << endl;
        c = 0x1F92F;
        wcout << fmt::format(L"Favorite emoji c is {0:c}", c) << endl;

        wcout << fmt::format(L"Favorite s is {0:s}", favorite) << endl;
        wcout << fmt::format(L"Favorite .2s is {0:.2s}", favorite) << endl;
        wcout << fmt::format(L"Favorite 11s is |{0:11s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite 11.2s is |{0:11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite <11.2s is |{0:<11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite ^11.2s is |{0:^11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite >11.2s is |{0:>11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite .<11.2s is |{0:.<11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite *^11.2s is |{0:*^11.2s}|", favorite) << endl;
        wcout << fmt::format(L"Favorite ->11.2s is |{0:->11.2s}|", favorite) << endl;

        wcout << fmt::format(L"Favorite d is {0:d}", i1) << endl;
        wcout << fmt::format(L"Another d is {0:d}", i2) << endl;
        wcout << fmt::format(L"Favorite b is {0:b}", i1) << endl;
        wcout << fmt::format(L"Another B is {0:b}", i2) << endl;
        wcout << fmt::format(L"Favorite o is {0:o}", i1) << endl;
        wcout << fmt::format(L"Another o is {0:o}", i2) << endl;
        wcout << fmt::format(L"Favorite x is {0:x}", i1) << endl;
        wcout << fmt::format(L"Another X is {0:X}", i2) << endl;
        wcout << fmt::format(L"Favorite #b is {0:#b}", i1) << endl;
        wcout << fmt::format(L"Another #B is {0:#b}", i2) << endl;
        wcout << fmt::format(L"Favorite #o is {0:#o}", i1) << endl;
        wcout << fmt::format(L"Another #o is {0:#o}", i2) << endl;
        wcout << fmt::format(L"Favorite #x is {0:#x}", i1) << endl;
        wcout << fmt::format(L"Another #X is {0:#X}", i2) << endl;
        wcout << fmt::format(L"Favorite 11d is |{0:11d}|", i1) << endl;
        wcout << fmt::format(L"Favorite +11d is |{0:+11d}|", i1) << endl;
        wcout << fmt::format(L"Favorite 011d is |{0:011d}|", i1) << endl;
        wcout << fmt::format(L"Favorite 011x is |{0:011x}|", i1) << endl;
        wcout << fmt::format(L"Favorite #011x is |{0:#011x}|", i1) << endl;

        wcout << fmt::format(L"Favorite f is {0:f}", fp1) << endl;
        wcout << fmt::format(L"Another f is {0:f}", fp2) << endl;
        wcout << fmt::format(L"One more f is {0:f}", fp3) << endl;
        wcout << fmt::format(L"Favorite e is {0:e}", fp1) << endl;
        wcout << fmt::format(L"Another e is {0:e}", fp2) << endl;
        wcout << fmt::format(L"One more e is {0:e}", fp3) << endl;
        wcout << fmt::format(L"Favorite g is {0:g}", fp1) << endl;
        wcout << fmt::format(L"Another g is {0:g}", fp2) << endl;
        wcout << fmt::format(L"One more g is {0:g}", fp3) << endl;
        wcout << fmt::format(L"Favorite .2f is {0:.2f}", fp1) << endl;
        wcout << fmt::format(L"Another .2f is {0:.2f}", fp2) << endl;
        wcout << fmt::format(L"One more .2f is {0:.2f}", fp3) << endl;
        wcout << fmt::format(L"Favorite .2e is {0:.2e}", fp1) << endl;
        wcout << fmt::format(L"Another .2e is {0:.2e}", fp2) << endl;
        wcout << fmt::format(L"One more .2e is {0:.2e}", fp3) << endl;
        wcout << fmt::format(L"Favorite .2g is {0:.2g}", fp1) << endl;
        wcout << fmt::format(L"Another .2g is {0:.2g}", fp2) << endl;
        wcout << fmt::format(L"One more .2g is {0:.2g}", fp3) << endl;
        wcout << fmt::format(L"Favorite 15.2f is |{0:15.2f}|", fp1) << endl;
        wcout << fmt::format(L"Another 15.2e is |{0:15.2e}|", fp2) << endl;
        wcout << fmt::format(L"One more 15.2g is |{0:15.2g}|", fp3) << endl;
        return 0;
    }
Output
Questions
- What is the difference in output between the printf() specifier "%8d" and "%8x"?
- What is the difference in output between the printf() specifier "%8d" and "%-8d"?
- What is the difference in output between the printf() specifier "%8d" and "%+8d"?
- What is the difference in output between the printf() specifier "%f" and "%e"?
- What is the difference in output between the printf() specifier "%f" and "%g"?
- How many characters from the string are printed when using "%10.8s"?
- How many characters in total are printed when using "%10.8s"?
- When using "%5.3f" how is 34.5678 formatted?
- When using "%.1f%%" how is 12.345 formatted?
- What is the difference in output between the format() specifier {0:010x} and {0:+10x}?
- What is the difference in output between the format() specifier {0:+5.2f} and {0:-5.2f}?
- What is the difference in output between the format() specifier {0:^20s} and {0:<20s}?
Projects
More ★'s indicate higher difficulty level.
- Aligning Characters and Strings with format()
- Aligning Characters and Strings with printf()
- Aligning Floating Point with format()
- Aligning Floating Point with printf()
- Aligning Integers with format()
- Aligning Integers with printf()
- Combined FICA Tax (Formatted)