E | J N Q | R W X | D S Y | F T | A M | C I V | B K U | L O P | G H Z e | j n q | r w x | d s y | f t | a m | c i v | b k u | l o p | g h z 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
We want to use this mapping for encoding telephone numbers by words, so that it becomes easier to remember the numbers.
Only exactly each encoding that is possible from this dictionary and that matches the phone number exactly shall be printed. Thus, possibly nothing is printed at all. The words in the dictionary contain letters (capital or small, but the difference is ignored in the sorting), dashes (-) and double quotes ("). For the encoding only the letters are used, but the words must be printed in exactly the form given in the dictionary. Leading non-letters do not occur in the dictionary. (The quotes indicate Umlaut characters, but this fact is irrelevant.)
Encodings of phone numbers can consist of a single word or of multiple words separated by spaces. The encodings are built word by word from left to right. If and only if at a particular point no word at all from the dictionary can be inserted, a single digit from the phone number can be copied to the encoding instead. Two subsequent digits are never allowed, though. To put it differently: In a partial encoding that currently covers k digits, digit k+1 is encoded by itself if and only if, first, digit k was not encoded by a digit and, second, there is no word in the dictionary that can be used in the partial encoding starting at digit k+1.
Your program must work on a series of phone numbers; for each encoding that it finds, it must print the phone number followed by a colon, a single(!) space, and the encoding on one line; trailing spaces are not allowed.
All remaining ambiguities in this specification will be resolved by the following example. (Still remaining ambiguities are intended degrees of freedom.)
Dictionary (in file test.w):
an blau Bo" Boot bo"s da Fee fern Fest fort je jemand mir Mix Mixer Name neu o"d Ort so Tor Torf Wasser
Phone number list (in file test.t):
112 5624-82 4824 0721/608-4067 10/783--5 1078-913-5 381482 04824
Program start command (must have this form!):
phonecode test.w test.t
Corresponding correct program output (to stdout):
5624-82: mir Tor 5624-82: Mix Tor 4824: Torf 4824: fort 4824: Tor 4 10/783--5: neu o"d 5 10/783--5: je bo"s 5 10/783--5: je Bo" da 381482: so 1 Tor 04824: 0 Torf 04824: 0 fort 04824: 0 Tor 4Any other output would be wrong (except for different ordering of the lines).
Wrong outputs for the above example would be e.g.
562482: Mix Tor
, because the formatting of the phone number is
incorrect,
10/783--5: je bos 5
, because the formatting of the second word
is incorrect,
4824: 4 Ort
, because in place of the first digit the words
Torf, fort, Tor could be used,
1078-913-5: je Bo" 9 1 da
, since there are two subsequent
digits in the encoding,
04824: 0 Tor
, because the encoding does not cover the whole
phone number, and
5624-82: mir Torf
, because the encoding is longer than the
phone number.
The above data are available to you in the files test.w (dictionary), test.t (telephone numbers) and test.out (program output).
Work carefully and deliver a high quality program. In particular, thoroughly comment your source code (design ideas etc.).
The focus during program construction shall be on correctness. Generate exactly the right output format. Do not generate additional output. I will automatically test your program with possibly hundreds of thousands of phone numbers and it should not make a single mistake -- in particular it must not crash. Take youself as much time as is required to ensure correctness.
Your program must be run time efficient in so far that it analyzes only a small fraction of all dictionary entries in each word appending step. The dictionary must be read into main memory entirely, but you must not do the same with the phone number file, as that may be arbitrarily large.
Your program needs not be robust against incorrect formats of the dictionary file or the phone number file.