Output Formatting

We have seen the basic ability to convert values to strings and output them to the console ([[stdout]]). Sometimes, however, we would like more control over the formatting used to output values. Fortunately, formatted output functions give us that control.

C-Style printf()

In the olden days, when the C language was developed in the 1970s, it introduced a novel way of formatting output: the printf() function. This function takes a format string followed by the values to be substituted into it, as directed by the type specifiers found in the format string. This approach has proven very popular and has been copied in many languages since.

With formatted output we must provide a format string that can contain literal text interspersed with type specifiers. The type specifiers are placeholders that identify where and how a value is to be formatted and inserted into the string on output. Each one starts with a percent sign (%), followed by optional flags, an optional field width and/or precision, and finally one or two characters identifying the type of the value. Only the percent sign and the type characters are required; the other components are optional.

%[flag][width][.precision][type specifier]

In addition to the format string, we also provide the values to be formatted to the function. We can have multiple type specifiers in the format string as long as we provide the matching number of additional arguments. The first argument is formatted by the first type specifier, the second argument by the second type specifier, and so on. Since C++ inherits the C standard libraries, we can use printf() in our C++ programs.

printf() Illustration — printf() specifiers must match arguments in order

Any other characters in the format string that are not part of a type specifier are printed verbatim. For example:

printf("Page count is %d\n", numPages);
printf("PI: %.2f\n", pi);
printf("Title and Page Count: %s %d\n", title, numPages);
printf("%d of %d\n", currentPage, numPages);
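The fragments above assume that numPages, pi, title and currentPage are already declared. As a minimal sketch of how arguments pair up with specifiers, the hypothetical helper below (describeBook() is our own name, not a library function) builds the third line with snprintf(), which is covered later in this section and accepts the same format strings as printf(). Writing into a buffer instead of the console simply makes the result easy to check.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only.
// The first specifier (%s) consumes title, the second (%d) consumes numPages.
std::string describeBook(const char *title, int numPages) {
	char buffer[128];
	std::snprintf(buffer, sizeof buffer, "Title and Page Count: %s %d", title, numPages);
	return std::string(buffer);
}
```

With the made-up values describeBook("Dune", 412) the result is "Title and Page Count: Dune 412". Swapping the two arguments would be an error, because %s would then receive an int.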

Some of the type specifiers available are listed in the table below.

Frequently Used Type Specifiers
Specifier	Value Type	Description
d	signed int	Signed decimal
u	unsigned int	Unsigned decimal
x	unsigned int	Hexadecimal (lowercase)
X	unsigned int	Hexadecimal (uppercase)
ld	signed long	Signed decimal
lu	unsigned long	Unsigned decimal
lx	unsigned long	Hexadecimal (lowercase)
lX	unsigned long	Hexadecimal (uppercase)
f	double	Fixed notation
e	double	Scientific notation (lowercase)
E	double	Scientific notation (uppercase)
g	double	Shortest representation: %e or %f
G	double	Shortest representation: %E or %F
c	int	Single character
lc	wchar_t	Single wide character
s	char*	C-String of characters
ls	wchar_t*	C-String of wide characters
p	void*	Pointer address in hexadecimal
%	none	Outputs literal %

The type specifiers d, u, x and X can be modified to print a short int (typically 16 bits) by prefixing the specifier with an 'h', as in 'hd' or 'hx'. They can also be modified to print a long int by prefixing the specifier with an 'l', as in 'ld' or 'lx'. A long int is 64 bits on most 64-bit Linux and macOS systems but only 32 bits on Windows.
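The length modifiers can be sketched as follows. The two helper names are hypothetical, and snprintf() (covered later in this section) is used only so that the formatted text comes back as a string; printf() accepts the same specifiers.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helpers for illustration only.

// %hd expects a short; the value is promoted to int when it is passed.
std::string showShort(short value) {
	char buffer[32];
	std::snprintf(buffer, sizeof buffer, "%hd", value);
	return std::string(buffer);
}

// %lx expects an unsigned long and prints it in lowercase hexadecimal.
std::string showLongHex(unsigned long value) {
	char buffer[32];
	std::snprintf(buffer, sizeof buffer, "%lx", value);
	return std::string(buffer);
}
```

Here showShort(-42) yields "-42" and showLongHex(65000UL) yields "fde8". Using plain %x with a long argument would be a mismatch between specifier and argument type.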

Print Flags
Flag	Description
-	Left-justify within the given field width; Right justification is the default.
+	Result is preceded by a plus or minus sign (+ or -) even for positive numbers. By default, only negative numbers are preceded by a - sign.
#	Used with the o, x or X specifiers, the value is preceded by 0, 0x or 0X respectively. Used with e, E, f, g or G, it forces the written output to contain a decimal point even if no more digits follow. By default, if no digits follow, no decimal point is written.
0	Left-pads the number with zeroes (0) instead of spaces
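To see the flags in action, here is a small sketch. The helpers formatInt() and formatUnsigned() are hypothetical names introduced for this illustration; they pass a format string to snprintf() (covered later in this section), which interprets flags exactly as printf() does.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats one int using a printf-style format string.
// The format must contain exactly one specifier that expects an int.
std::string formatInt(const char *format, int value) {
	char buffer[64];
	std::snprintf(buffer, sizeof buffer, format, value);
	return std::string(buffer);
}

// Hypothetical helper: same idea for specifiers that expect an unsigned int
// (u, o, x or X).
std::string formatUnsigned(const char *format, unsigned int value) {
	char buffer[64];
	std::snprintf(buffer, sizeof buffer, format, value);
	return std::string(buffer);
}
```

With these helpers, formatInt("[%6d]", 42) gives "[    42]", formatInt("[%-6d]", 42) gives "[42    ]", formatInt("[%+d]", 42) gives "[+42]", formatInt("[%06d]", 42) gives "[000042]" and formatUnsigned("[%#x]", 255u) gives "[0xff]". The square brackets are literal text that makes the padding visible.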

The field width and precision are simply integer values separated by a period (.). One may use either one, both or neither in a type specifier. Field width sets the minimum number of characters used to output the value, which is padded with spaces if necessary to reach that width; longer values are never truncated by the width. The precision is mostly useful for formatting floating-point and string values. For floating-point values printed with f, e or E it sets the number of digits after the decimal point, and with g or G it sets the number of significant digits. For strings it truncates the string to the specified number of characters.
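A short sketch of width and precision working together follows. As before, formatDouble() and formatText() are hypothetical helper names, and snprintf() (covered later in this section) stands in for printf() so the result can be returned as a string.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats one double using a printf-style format string.
std::string formatDouble(const char *format, double value) {
	char buffer[64];
	std::snprintf(buffer, sizeof buffer, format, value);
	return std::string(buffer);
}

// Hypothetical helper: formats one C string using a printf-style format string.
std::string formatText(const char *format, const char *text) {
	char buffer[64];
	std::snprintf(buffer, sizeof buffer, format, text);
	return std::string(buffer);
}
```

For example, formatDouble("[%8.3f]", 3.14159) gives "[   3.142]" (three decimals, padded to eight characters), formatDouble("[%.2g]", 3.14159) gives "[3.1]" (two significant digits), formatText("[%10s]", "Apples") gives "[    Apples]" and formatText("[%-10.3s]", "Bananas") gives "[Ban       ]" (truncated to three characters, then left-justified in ten).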

OutputFormatting1.cpp

#include <cstdio>
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char **argv) {
	int const a = 326;
	int const b = -1;
	int const c = 2015;
	long const i1 = 65000;
	long const i2 = -2;
	long const i3 = 3261963;
	double const f1 = 3.1415926;
	double const f2 = 2.99792458e9;
	double const f3 = 1.234e-4;
	int const c1 = int('A');
	int const c2 = int('B');
	int const c3 = int('C');
	string const s1 = "Apples";
	string const s2 = "and";
	string const s3 = "Bananas";
	bool const b1 = true;
	bool const b2 = false;

	printf("Decimals: %d %d %d\n", a, b, c);
	printf("Unsigned Decimals: %u %u %u\n", a, b, c);
	printf("Hexadecimals: %#x %#x %#x\n", a, b, c);
	printf("Long Decimals: %ld %ld %ld\n", i1, i2, i3);
	printf("Long Hexadecimals: %016lx %016lx %016lx\n", i1, i2, i3);

	printf("Fixed FP: %f %f %f\n", f1, f2, f3);
	printf("Exponential FP: %e %e %e\n", f1, f2, f3);
	printf("General FP: %g %g %g\n", f1, f2, f3);
	printf("General FP with precision: %.2g %.2g %.2g\n", f1, f2, f3);

	printf("Boolean: %d %d\n", b1, b2);
	printf("Character: %c %c %c\n", c1, c2, c3);
	printf("String: %s %s %s\n", s1.c_str(), s2.c_str(), s3.c_str());
	return 0;
}

Output

$ g++ -std=c++17 OutputFormatting1.cpp -o OutputFormatting1 -lfmt
$ ./OutputFormatting1
Decimals: 326 -1 2015
Unsigned Decimals: 326 4294967295 2015
Hexadecimals: 0x146 0xffffffff 0x7df
Long Decimals: 65000 -2 3261963
Long Hexadecimals: 000000000000fde8 fffffffffffffffe 000000000031c60b
Fixed FP: 3.141593 2997924580.000000 0.000123
Exponential FP: 3.141593e+00 2.997925e+09 1.234000e-04
General FP: 3.14159 2.99792e+09 0.0001234
General FP with precision: 3.1 3e+09 0.00012
Boolean: 1 0
Character: A B C
String: Apples and Bananas

Because printf() is a C function not a C++ function, we can't pass C++ strings but must convert them to C strings using the c_str() method of C++ strings.

C-Style sprintf()

Along with the printf() function, the C language introduced a similar function called sprintf(). Instead of printing a formatted string to the console ([[stdout]]), it writes the formatted result into a string. This more general form is useful not only for console output but also for [[GUI]] output and file output.

The sprintf() function in C++ is a holdover from the C language. As such, it doesn't return a C++ string but instead requires a buffer to be passed in as the first argument. The buffer is an array of characters that must be large enough to hold any possible formatted output. If the formatted output exceeds the size of this buffer, sprintf() writes past the end of the array, which is undefined behavior and a classic source of crashes and security holes. A safer version, snprintf(), takes the buffer size as an additional argument and limits the output so that it cannot exceed the buffer. The example below calls a Utils::sprintf() helper, declared in Utils.hpp, that returns a C++ string. We will cover C-style strings and arrays later.
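The buffer handling described above can be sketched with snprintf(). The function names labelValue() and neededLength() are hypothetical, and the 16-byte buffer is deliberately small so that truncation is easy to observe.

```cpp
#include <cstdio>
#include <string>

// Formats "<label>: <value>" into a deliberately small buffer.
// snprintf() writes at most sizeof buffer bytes, including the terminating
// '\0', so an overlong result is truncated instead of overflowing the array.
std::string labelValue(const char *label, int value) {
	char buffer[16];
	std::snprintf(buffer, sizeof buffer, "%s: %d", label, value);
	return std::string(buffer);
}

// snprintf() returns the length the complete output would have had.
// Passing a null buffer of size 0 asks for that length without writing anything.
int neededLength(const char *label, int value) {
	return std::snprintf(nullptr, 0, "%s: %d", label, value);
}
```

Here labelValue("Pages", 326) fits and yields "Pages: 326", while labelValue("Total page count", 326) is cut off after 15 characters and yields "Total page coun". neededLength("Total page count", 326) returns 21, which tells the caller how large a buffer (plus one byte for the terminator) would avoid the truncation.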

OutputFormatting2.cpp

#include "Utils.hpp"
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char **argv) {
	int const a = 326;
	int const b = -1;
	int const c = 2015;
	long const i1 = 65000;
	long const i2 = -2;
	long const i3 = 3261963;
	double const f1 = 3.1415926;
	double const f2 = 2.99792458e9;
	double const f3 = 1.234e-4;
	int const c1 = int('A');
	int const c2 = int('B');
	int const c3 = int('C');
	string const s1 = "Apples";
	string const s2 = "and";
	string const s3 = "Bananas";
	bool const b1 = true;
	bool const b2 = false;

	string s;

	s = Utils::sprintf("Decimals: %d %d %d", a, b, c);
	cout << s << endl;
	s = Utils::sprintf("Unsigned Decimals: %u %u %u", a, b, c);
	cout << s << endl;
	s = Utils::sprintf("Hexadecimals: %#x %#x %#x", a, b, c);
	cout << s << endl;
	s = Utils::sprintf("Long Decimals: %ld %ld %ld", i1, i2, i3);
	cout << s << endl;
	s = Utils::sprintf("Long Hexadecimals: %016lx %016lx %016lx", i1, i2, i3);
	cout << s << endl;

	s = Utils::sprintf("Fixed FP: %f %f %f", f1, f2, f3);
	cout << s << endl;
	s = Utils::sprintf("Exponential FP: %e %e %e", f1, f2, f3);
	cout << s << endl;
	s = Utils::sprintf("General FP: %g %g %g", f1, f2, f3);
	cout << s << endl;
	s = Utils::sprintf("General FP with precision: %.2g %.2g %.2g", f1, f2, f3);
	cout << s << endl;

	s = Utils::sprintf("Boolean: %d %d", b1, b2);
	cout << s << endl;
	s = Utils::sprintf("Character: %c %c %c", c1, c2, c3);
	cout << s << endl;
	s = Utils::sprintf("String: %s %s %s", s1.c_str(), s2.c_str(), s3.c_str());
	cout << s << endl;
	return 0;
}

Output

$ g++ -std=c++17 OutputFormatting2.cpp -o OutputFormatting2 -lfmt
$ ./OutputFormatting2
Decimals: 326 -1 2015
Unsigned Decimals: 326 4294967295 2015
Hexadecimals: 0x146 0xffffffff 0x7df
Long Decimals: 65000 -2 3261963
Long Hexadecimals: 000000000000fde8 fffffffffffffffe 000000000031c60b
Fixed FP: 3.141593 2997924580.000000 0.000123
Exponential FP: 3.141593e+00 2.997925e+09 1.234000e-04
General FP: 3.14159 2.99792e+09 0.0001234
General FP with precision: 3.1 3e+09 0.00012
Boolean: 1 0
Character: A B C
String: Apples and Bananas

Because sprintf() is a C function, we can't pass C++ strings but must convert them to C strings using the c_str() method of strings.

String Interpolation

Many languages can replace variables written directly into strings with their values. This variable substitution in strings is known as variable interpolation. C++ does not support any form of variable interpolation in its string literals.

Modern Message Formatting

The methods for formatting strings and output discussed so far have some limitations when it comes to localizing software. The positional approach taken by the printf()-style functions poses difficulties for localization because translation often changes the order of words, and thus of the substitution specifiers, while the hard-coded argument list in our code cannot change to match. Variable interpolation has its own drawback: you are actually putting code into the strings. When externalizing the strings for localization (removing them from the code and putting them in a separate file), it is not desirable to export variables and expressions from our code to the localization file, where they can be changed.

This is why message formatting approaches have been developed that use substitution specifiers that specify which argument to the message format function is to be used. This allows the substitution specifiers to be in a different order (and perhaps re-ordered during localization) than the formal arguments to the message format function. These substitution specifiers are also not executable code as is the case with variable interpolation so it is much safer to externalize from our program code as we will see in the section on Internationalization.
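To make the idea concrete, here is a deliberately simplified sketch of position-based substitution. formatMessage() is a hypothetical function written for this illustration, not part of any library; it only understands single-digit placeholders such as {0} and {1} and ignores type specifiers and escaping.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified message formatter for illustration only.
// Replaces each "{N}" (N a single digit) in pattern with args[N].
// Anything that is not a valid placeholder is copied unchanged.
std::string formatMessage(const std::string &pattern, const std::vector<std::string> &args) {
	std::string result;
	for (std::size_t i = 0; i < pattern.size(); ++i) {
		// A placeholder has the shape '{', one digit, '}'.
		bool const isPlaceholder = i + 2 < pattern.size() && pattern[i] == '{' &&
			std::isdigit(static_cast<unsigned char>(pattern[i + 1])) && pattern[i + 2] == '}';
		std::size_t const index =
			isPlaceholder ? static_cast<std::size_t>(pattern[i + 1] - '0') : 0;
		if (isPlaceholder && index < args.size()) {
			result += args[index]; // substitute the referenced argument
			i += 2;                // skip the digit and the closing brace
		} else {
			result += pattern[i];  // copy ordinary text unchanged
		}
	}
	return result;
}
```

With the made-up argument list {"Dune", "Frank Herbert"}, the pattern "{0} was written by {1}" produces "Dune was written by Frank Herbert", while the pattern "{1} wrote {0}" produces "Frank Herbert wrote Dune". Only the pattern changed; the argument list in the code stayed the same, which is exactly what a translator needs.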

Modern message formatting based on the Python format() function was approved for inclusion in the C++ 2020 standard. Recent releases of the major compilers provide it, but older toolchains may not. Where it is available, all you need to do is include <format> and compile in C++20 mode to be able to use the std::format() function.

#include <format>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char **argv) {
	string s1 = format("{1} to {0}", "a", "z");
	cout << s1 << endl;
}

C++ 2020 Standard Example

There is already an implementation available at the [[Github C++ Modern Formatting Library]]. Start by downloading the library in ZIP format by clicking on Releases on the left side of the page. Unzip the downloaded "fmt-10.1.1.zip" file to create the directory of source files named "fmt-10.1.1". Then in a console window execute these commands:

$ mkdir ~/Downloads/fmt-10.1.1-build
$ cd ~/Downloads/fmt-10.1.1-build
$ cmake ../fmt-10.1.1
$ make
$ make test
$ sudo make install

On Linux or Raspberry Pi if "cmake" doesn't run simply type: $ sudo apt install cmake to install it from the command line.

On MacOS X if "cmake" doesn't run you will need to install a Unix package manager such as [[MacPorts]] or [[Homebrew]]. For example, install MacPorts, then in a new terminal window type: $ sudo port install cmake to install it from the command line.

On Windows if "cmake" doesn't run you will need to install it using the [[Cygwin]] setup program. Also make sure that the program "make" is installed as well.

On Cygwin, before running "make", open the file "fmt-10.1.1/test/scan-test.cc" with a text editor. Find the <time.h> and <climits> includes at the beginning of the file, then add one line before and one line after those includes, as shown below. With this modification you should be able to run "make". The "make test" command will have one failure that we will have to ignore for now. On Cygwin you don't need to use the "sudo" command to install, so just type "make install" as the final command.

#define _XOPEN_SOURCE 500
#include <time.h>
#include <climits>
#include <string.h>

Modifications to scan-test.cc for Cygwin

Once the format library is installed, you will need to add an additional argument to the C++ compiler command line to link this library into your programs. Simply add -lfmt after the source file, as illustrated below.

$ g++ hello.cpp -o hello -lfmt

To use the library in your source code you will need to include <fmt/format.h> at the top of your source files. Then call the library through the fmt namespace. Qualifying the call as fmt::format() avoids a clash with std::format() on compilers that already provide <format>.

#include <iostream>
#include <string>
#include <fmt/format.h>
using namespace std;

int main(int argc, char **argv) {
	string s1 = fmt::format("{1} to {0}", "a", "z");
	cout << s1 << endl;
}

Github format() Library Example

The format() function uses a pair of curly brackets to identify substitution placeholders in the format string. Each pair of curly brackets contains a number from 0 to the number of additional arguments minus 1. This number refers to the position in the argument list of the value that will be used in the substitution. This allows values in the argument list to be used in any order needed in the format string or even used more than once.

format() Illustration — format() specifiers can match any argument

Following the position number in the substitution placeholder is an optional type specifier. The type specifier is separated from the position number by a colon. If you don't use a type specifier, the value is formatted with the default presentation for its type (for example, decimal for integers). Some of the type specifiers available are listed in the table below.

Frequently Used Type Specifiers
Specifier	Value Type	Description
b	integer	binary
d	integer	decimal
o	integer	octal
x	integer	hexadecimal (lowercase)
X	integer	hexadecimal (uppercase)
f	double	fixed notation
e	double	scientific notation (lowercase)
E	double	scientific notation (uppercase)
g	double	shortest representation: e or f
G	double	shortest representation: E or F
c	integer	single character
s	string	string of characters

As we have seen with printf(), the format() specifiers can take optional modifiers that change how the value is to be formatted. The full definition of the specifiers, including optional components, is as follows:

{[arg-pos]:[fill-and-align][sign][#][0][width][.precision][type specifier]}

The fill-and-align option is an optional fill character (which can be any character other than { or }), followed by one of the align options <, >, ^. If no fill character is specified, then the space character is used.

Align Options
Option	Description
`<`	Left-justify within the given field width. This is the default for non-numeric types.
`>`	Right-justify within the given field width. This is the default for numeric types.
`^`	Aligns the value in the center of the field.

The sign option is one of +, - or the space character.

Sign Options
Option	Description
+	Signifies that a sign character should be output for all values (`+` for positive and `-` for negative).
-	Signifies that a sign character should be output for negative values only. This is the default for numeric types.
<space>	Signifies that a leading space should be used for positive values and a `-` should be output for negative values.

The # option causes alternate formatting to be used.

For integral types, when binary, octal, or hexadecimal presentation type is used, the alternate form inserts the prefix (0b, 0, or 0x) into the output value after the sign character.
For floating-point types, the alternate form causes the result of the format to always contain a decimal-point character, even if no digits follow it.

The 0 option pads the field with leading zeros (following any indication of sign or base) to the field width. If the 0 option and an align option both are used, the 0 option is ignored.

The width option is a positive decimal number. If present, it specifies the minimum field width. If the formatted value cannot fit within the field width, the entire value will be inserted, causing the field to be larger than width.

The precision option is a . followed by a non-negative decimal number. This option indicates the precision to use when formatting a value. For floating-point types formatted with f, e or E, it is the number of digits displayed after the decimal point; with g or G, it is the number of significant digits. For string types, it specifies how many characters from the string will be used.

OutputFormatting4.cpp

#include <clocale>
#include <codecvt>
#include <fmt/format.h>
#include <fmt/xchar.h>
#include <iostream>
#include <locale>
#include <string>

std::locale utf8loc(std::locale(), new std::codecvt_utf8<wchar_t>);
using namespace std;

int main(int argc, char **argv) {
	setlocale(LC_ALL, "en_US.UTF-8");
	wcout.imbue(utf8loc);
	wcin.imbue(utf8loc);

	wstring const first = L"First";
	wstring const middle = L"Middle";
	wstring const lastname = L"Last";
	wstring const left = L"Left";
	wstring const center = L"Center";
	wstring const right = L"Right";
	wstring const favorite = L"kerfuffle";
	long const i1 = 3261963;
	short const i2 = -42;
	double const fp1 = 3.1415926;
	double const fp2 = 2.99792458e9;
	double const fp3 = -1.234e-4;

	wstring s = fmt::format(L"{2}, {0} {1}", first, middle, lastname);
	wcout << s << endl;
	s = fmt::format(L"{0} {1} {2}", left, center, right);
	wcout << s << endl;
	s = fmt::format(L"Favorite number is {0}", i1);
	wcout << s << endl;
	s = fmt::format(L"Favorite FP is {0}", fp1);
	wcout << s << endl;

	wchar_t c = favorite[0];
	wcout << fmt::format(L"Favorite c is {0:c}", c) << endl;
	wcout << fmt::format(L"Favorite 11c is   |{0:11c}|", c) << endl;
	wcout << fmt::format(L"Favorite <11c is  |{0:<11c}|", c) << endl;
	wcout << fmt::format(L"Favorite ^11c is  |{0:^11c}|", c) << endl;
	wcout << fmt::format(L"Favorite >11c is  |{0:>11c}|", c) << endl;
	wcout << fmt::format(L"Favorite .<11c is |{0:.<11c}|", c) << endl;
	wcout << fmt::format(L"Favorite _^11c is |{0:_^11c}|", c) << endl;
	wcout << fmt::format(L"Favorite  >11c is |{0: >11c}|", c) << endl;

	c = 0x1F92F;
	wcout << fmt::format(L"Favorite emoji c is {0:c}", c) << endl;

	wcout << fmt::format(L"Favorite s is {0:s}", favorite) << endl;
	wcout << fmt::format(L"Favorite .2s is {0:.2s}", favorite) << endl;
	wcout << fmt::format(L"Favorite 11s is     |{0:11s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite 11.2s is   |{0:11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite <11.2s is  |{0:<11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite ^11.2s is  |{0:^11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite >11.2s is  |{0:>11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite .<11.2s is |{0:.<11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite *^11.2s is |{0:*^11.2s}|", favorite) << endl;
	wcout << fmt::format(L"Favorite ->11.2s is |{0:->11.2s}|", favorite) << endl;

	wcout << fmt::format(L"Favorite d is {0:d}", i1) << endl;
	wcout << fmt::format(L"Another d is {0:d}", i2) << endl;
	wcout << fmt::format(L"Favorite b is {0:b}", i1) << endl;
	wcout << fmt::format(L"Another B is {0:b}", i2) << endl;
	wcout << fmt::format(L"Favorite o is {0:o}", i1) << endl;
	wcout << fmt::format(L"Another o is {0:o}", i2) << endl;
	wcout << fmt::format(L"Favorite x is {0:x}", i1) << endl;
	wcout << fmt::format(L"Another X is {0:X}", i2) << endl;
	wcout << fmt::format(L"Favorite #b is {0:#b}", i1) << endl;
	wcout << fmt::format(L"Another #B is {0:#b}", i2) << endl;
	wcout << fmt::format(L"Favorite #o is {0:#o}", i1) << endl;
	wcout << fmt::format(L"Another #o is {0:#o}", i2) << endl;
	wcout << fmt::format(L"Favorite #x is {0:#x}", i1) << endl;
	wcout << fmt::format(L"Another #X is {0:#X}", i2) << endl;
	wcout << fmt::format(L"Favorite 11d is   |{0:11d}|", i1) << endl;
	wcout << fmt::format(L"Favorite +11d is  |{0:+11d}|", i1) << endl;
	wcout << fmt::format(L"Favorite 011d is  |{0:011d}|", i1) << endl;
	wcout << fmt::format(L"Favorite 011x is  |{0:011x}|", i1) << endl;
	wcout << fmt::format(L"Favorite #011x is |{0:#011x}|", i1) << endl;

	wcout << fmt::format(L"Favorite f is {0:f}", fp1) << endl;
	wcout << fmt::format(L"Another f is {0:f}", fp2) << endl;
	wcout << fmt::format(L"One more f is {0:f}", fp3) << endl;
	wcout << fmt::format(L"Favorite e is {0:e}", fp1) << endl;
	wcout << fmt::format(L"Another e is {0:e}", fp2) << endl;
	wcout << fmt::format(L"One more e is {0:e}", fp3) << endl;
	wcout << fmt::format(L"Favorite g is {0:g}", fp1) << endl;
	wcout << fmt::format(L"Another g is {0:g}", fp2) << endl;
	wcout << fmt::format(L"One more g is {0:g}", fp3) << endl;
	wcout << fmt::format(L"Favorite .2f is {0:.2f}", fp1) << endl;
	wcout << fmt::format(L"Another .2f is {0:.2f}", fp2) << endl;
	wcout << fmt::format(L"One more .2f is {0:.2f}", fp3) << endl;
	wcout << fmt::format(L"Favorite .2e is {0:.2e}", fp1) << endl;
	wcout << fmt::format(L"Another .2e is {0:.2e}", fp2) << endl;
	wcout << fmt::format(L"One more .2e is {0:.2e}", fp3) << endl;
	wcout << fmt::format(L"Favorite .2g is {0:.2g}", fp1) << endl;
	wcout << fmt::format(L"Another .2g is {0:.2g}", fp2) << endl;
	wcout << fmt::format(L"One more .2g is {0:.2g}", fp3) << endl;
	wcout << fmt::format(L"Favorite 15.2f is |{0:15.2f}|", fp1) << endl;
	wcout << fmt::format(L"Another 15.2e is  |{0:15.2e}|", fp2) << endl;
	wcout << fmt::format(L"One more 15.2g is |{0:15.2g}|", fp3) << endl;
	return 0;
}

Output

$ g++ -std=c++17 OutputFormatting4.cpp -o OutputFormatting4 -lfmt
$ ./OutputFormatting4
Last, First Middle
Left Center Right
Favorite number is 3261963
Favorite FP is 3.1415926
Favorite c is k
Favorite 11c is   |k          |
Favorite <11c is  |k          |
Favorite ^11c is  |     k     |
Favorite >11c is  |          k|
Favorite .<11c is |k..........|
Favorite _^11c is |_____k_____|
Favorite  >11c is |          k|
Favorite emoji c is 🤯
Favorite s is kerfuffle
Favorite .2s is ke
Favorite 11s is     |kerfuffle  |
Favorite 11.2s is   |ke         |
Favorite <11.2s is  |ke         |
Favorite ^11.2s is  |    ke     |
Favorite >11.2s is  |         ke|
Favorite .<11.2s is |ke.........|
Favorite *^11.2s is |****ke*****|
Favorite ->11.2s is |---------ke|
Favorite d is 3261963
Another d is -42
Favorite b is 1100011100011000001011
Another B is -101010
Favorite o is 14343013
Another o is -52
Favorite x is 31c60b
Another X is -2A
Favorite #b is 0b1100011100011000001011
Another #B is -0b101010
Favorite #o is 014343013
Another #o is -052
Favorite #x is 0x31c60b
Another #X is -0X2A
Favorite 11d is   |    3261963|
Favorite +11d is  |   +3261963|
Favorite 011d is  |00003261963|
Favorite 011x is  |0000031c60b|
Favorite #011x is |0x00031c60b|
Favorite f is 3.141593
Another f is 2997924580.000000
One more f is -0.000123
Favorite e is 3.141593e+00
Another e is 2.997925e+09
One more e is -1.234000e-04
Favorite g is 3.14159
Another g is 2.99792e+09
One more g is -0.0001234
Favorite .2f is 3.14
Another .2f is 2997924580.00
One more .2f is -0.00
Favorite .2e is 3.14e+00
Another .2e is 3.00e+09
One more .2e is -1.23e-04
Favorite .2g is 3.1
Another .2g is 3e+09
One more .2g is -0.00012
Favorite 15.2f is |           3.14|
Another 15.2e is  |       3.00e+09|
One more 15.2g is |       -0.00012|

Questions

What is the difference in output between the printf() specifier "%8d" and "%8x"?
What is the difference in output between the printf() specifier "%8d" and "%-8d"?
What is the difference in output between the printf() specifier "%8d" and "%+8d"?
What is the difference in output between the printf() specifier "%f" and "%e"?
What is the difference in output between the printf() specifier "%f" and "%g"?
How many characters from the string are printed when using "%10.8s"?
How many characters in total are printed when using "%10.8s"?
When using "%5.3f" how is 34.5678 formatted?
When using "%.1f%%" how is 12.345 formatted?
What is the difference in output between the format() specifier {0:010x} and {0:+10x}?
What is the difference in output between the format() specifier {0:+5.2f} and {0:-5.2f}?
What is the difference in output between the format() specifier {0:^20s} and {0:<20s}?

Projects

More ★'s indicate higher difficulty level.

References

[[cplusplus.com printf]]