Without even knowing anything about the semantics of the fields,
we can notice that it would be hard to pack the data much tighter in
a binary format. The colon sentinel characters would have to have
functional equivalents taking at least as much space (usually
either count bytes or NULs). The per-user records would either have
to have terminators (which could hardly be shorter than a single
newline) or else be wastefully padded out to a fixed length.
In fact, the prospects for saving space through binary encoding
pretty much vanish once you know the actual semantics of the data. The
numeric user ID (3rd) and group ID (4th) fields are integers; thus on
most machines a binary representation would be at least 4 bytes, and
longer than the text for all values up to 999. But let's agree to ignore
this for now and suppose the best case, in which the numeric fields have a
0-255 range.
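The size comparison in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names text_size and binary_size are invented here, and the fixed 32-bit width stands in for the "at least 4 bytes" that a binary integer would take on most machines.

```python
import struct


def text_size(n: int) -> int:
    """Bytes needed to store n as decimal ASCII digits."""
    return len(str(n).encode("ascii"))


def binary_size(n: int) -> int:
    """Bytes needed to store n as a fixed-width 32-bit unsigned integer.

    The 32-bit width is an assumption made for this sketch.
    """
    return len(struct.pack("<I", n))


# Print each sample ID with its text size and its binary size.
for uid in (0, 99, 999, 1000, 65534):
    print(uid, text_size(uid), binary_size(uid))
```

The text form is shorter for every value up to 999, breaks even at four digits, and loses by a single byte at five digits.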
We could tighten up the numeric fields (3rd and 4th) by
collapsing them to single bytes, and the password strings
(2nd) by moving them to an 8-bit encoding. On this example, that would
give about an 8% size decrease.
That 8% of putative inefficiency buys us a lot. It avoids
putting an arbitrary limit on the range of the numeric fields. It
gives us the ability to modify the password file with any old text
editor of our choice, rather than having to build a specialized tool
to edit a binary format (though in the case of the password file
itself, we have to be extra careful about concurrent edits). And it
gives us the ability to do ad-hoc searches and filters and reports on
the user account information with text-stream tools such as
grep(1).
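As a rough illustration of how little code an ad-hoc query over this format needs, here is a minimal Python sketch. The sample records and the helper name users_with_shell are invented for the example; the only thing it relies on is the standard seven-field, colon-separated layout, with the login shell in the last field.

```python
# Invented sample data in the colon-separated password-file layout.
SAMPLE = """\
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
alice:x:500:100:Alice Example:/home/alice:/bin/sh
"""


def users_with_shell(text, shell):
    """Return the login names of all records whose last field equals shell."""
    names = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines
        fields = line.split(":")
        if fields[-1] == shell:
            names.append(fields[0])
    return names


print(users_with_shell(SAMPLE, "/bin/sh"))  # prints ['root', 'alice']
```

A binary format would need a purpose-built reader before even this much could be attempted.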
We do have to be a bit careful about not embedding a colon in
any of the textual fields. Good practice is to have the code that
writes the file precede each embedded colon with an escape character,
and to have the code that reads the file interpret that escape. Unix
tradition favors backslash for this use.
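The escaping convention just described might look like the following Python sketch. The function names are hypothetical, and the choice to escape the backslash itself (so that a literal backslash survives a round trip) is an assumption of this example rather than something the format mandates. Embedded newlines are not handled here.

```python
def escape_field(value: str) -> str:
    """Escape backslashes first, then colons, so decoding is unambiguous."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def join_record(fields):
    """Write side: build one colon-separated record from a list of fields."""
    return ":".join(escape_field(f) for f in fields)


def split_record(line):
    """Read side: split on unescaped colons and undo the escaping."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a lone trailing
            # backslash is kept as a backslash.
            current.append(next(chars, "\\"))
        elif ch == ":":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


record = ["alice", "Ops: night shift", "C:\\tmp"]
line = join_record(record)
assert split_record(line) == record
```

Note that a reader that splits naively on every colon would miscount the fields of such a record, so the writer and the reader have to agree on the convention.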
The fact that structural information is conveyed by field
position rather than an explicit tag makes this format faster
to read and write, but a bit rigid. If the set of properties
associated with a key is expected to change with any frequency,
one of the tagged formats described below might be a better choice.