\chapter{Classic ciphers} \label{chapter:classic} Modern cryptography has come a long way. In his excellent book on cryptography, Singh traces it back to at least 5th century B.C., to the times of Herodotus and the ancient Greeks~\cite{Singh:1999:CBE}. That's some 2500 years ago, and surely we do not use those methods anymore in modern day cryptography. However, the basic techniques are still relevant for appreciating the art of secret writing. Shift ciphers\indShiftcipher construct the \glosCiphertext ciphertext\indCiphertext from the \glosPlaintext plaintext\indPlaintext\ by means of a predefined {\em shifting} operation,\glosCipherkey where the cipherkey of a particular shift algorithm defines the shift amount of the cipher.\indCipherkey Transposition ciphers work by keeping the plaintext the same, but {\em rearrange} the order of the characters according to a certain rule. The cipherkey is essentially the description of how this transposition is done.\indTranspositioncipher Substitution ciphers\indSubstitutioncipher generalize shifts and transpositions, allowing one to substitute arbitrary codes for plaintext elements. In this chapter, we will study several examples of these techniques and see how we can code them in Cryptol. In general, ciphers boil down to pairs of functions \emph{encrypt} and \emph{decrypt} which ``fit together'' in the appropriate way. Arguing that a cryptographic function is \emph{correct} is subtle. Correctness of cryptography is determined by cryptanalyses by expert cryptographers. Each kind of cryptographic primitive (i.e., a hash, a symmetric cipher, an asymmetric cipher, etc.) has a set of expected properties, many of which can only be discovered and proven by hand through a lot of hard work. Thus, to check the correctness of a cryptographic function, a best practice for Cryptol use is to encode as many of these properties as one can in Cryptol itself and use Cryptol's validation and verification capabilities, discussed later in~\autoref{cha:high-assur-progr}. For example, the fundamental property of most ciphers is that encryption and decryption are inverses of each other. To check the correctness of an \emph{implementation} $I$ of a cryptographic function $C$ means that one must show that the implementation $I$ behaves as the specification ($C$) stipulates. In the context of cryptography, the minimal conformance necesssary is that $I$'s output \emph{exactly} conforms to the output characterized by $C$. But just because a cryptographic implementation is \emph{functionally correct} does not mean it is \emph{secure}. The subtleties of an implementation can leak all kinds of information that harm the security of the cryptography, including abstraction leaking of sensitive values, timing attacks, side-channel attacks, etc. These kinds of properties cannot currently be expressed or reasoned about in Cryptol. Also, Cryptol does \emph{not} give the user any feedback on the \emph{strength} of a given (cryptographic) algorithm. While this is an interesting and useful feature, it is not part of Cryptol's current capabilities. %===================================================================== \section{Caesar's cipher} \label{sec:caesar} \sectionWithAnswers{Caesar's cipher}{sec:caesar} Caesar's cipher (a.k.a. Caesar's shift) is one of the simplest ciphers. The letters in the plaintext\indPlaintext are shifted by a fixed number of elements down the alphabet.\indCaesarscipher For instance, if the shift is 2, {\tt A} becomes {\tt C}, {\tt B} becomes {\tt D}, and so on. Once we run out of letters, we circle back to {\tt A}; so {\tt Y} becomes {\tt A}, and {\tt Z} becomes {\tt B}. Coding Caesar's cipher in Cryptol is quite straightforward (recall from Section~\ref{sec:tsyn} that a {\tt String n} is simply a sequence of n 8-bit words.):\indTSString \begin{code} caesar : {n} ([8], String n) -> String n caesar (s, msg) = [ shift x | x <- msg ] where map = ['A' .. 'Z'] <<< s shift c = map @ (c - 'A') \end{code} In this definition, we simply get a message {\tt msg} of type {\tt String n}, and perform a {\tt shift} operation on each one of the elements. The {\tt shift} function is defined locally in the {\tt where}-clause.\indWhere To compute the shift, we first find the distance of the letter from the character {\tt 'A'} (via {\tt c - 'A'}), and look it up in the mapping imposed by the shift. The {\tt map} is simply the alphabet rotated to the left by the shift amount, {\tt s}. Note how we use the enumeration {\tt ['A' .. 'Z']} to get all the letters in the alphabet.\indEnum \begin{Exercise}\label{ex:caesar:0} What is the map corresponding to a shift of 2? Use Cryptol's \verb+<<<+\indRotLeft to compute it. You can use the command {\tt :set ascii=on}\indSettingASCII to print strings in ASCII, like this: \begin{Verbatim} Cryptol> :set ascii=on Cryptol> "Hello World" "Hello World" \end{Verbatim} Why do we use a left-rotate, instead of a right-rotate? \end{Exercise} \begin{Answer}\ansref{ex:caesar:0} Here is the alphabet and the corresponding shift-2 Caesar's alphabet: \begin{verbatim} Cryptol> ['A'..'Z'] "ABCDEFGHIJKLMNOPQRSTUVWXYZ" Cryptol> ['A'..'Z'] <<< 2 "CDEFGHIJKLMNOPQRSTUVWXYZAB" \end{verbatim} We use a left rotate to get the characters lined up correctly, as illustrated above. \indRotLeft\indRotRight \end{Answer} \begin{Exercise}\label{ex:caesar:1} Use the above definition to encrypt the message {\tt "ATTACKATDAWN"} by shifts 0, 3, 12, and 52. What happens when the shift is a multiple of 26? Why? \end{Exercise} \begin{Answer}\ansref{ex:caesar:1} Here are Cryptol's responses: \begin{Verbatim} Cryptol> caesar (0, "ATTACKATDAWN") "ATTACKATDAWN" Cryptol> caesar (3, "ATTACKATDAWN") "DWWDFNDWGDZQ" Cryptol> caesar (12, "ATTACKATDAWN") "MFFMOWMFPMIZ" Cryptol> caesar (52, "ATTACKATDAWN") "ATTACKATDAWN" \end{Verbatim} If the shift is a multiple of 26 (as in 0 and 52 above), the letters will cycle back to their original values, so encryption will leave the message unchanged. Users of the Caesar's cipher should be careful about picking the shift amount! \end{Answer} \begin{Exercise}\label{ex:caesar:2} Write a function {\tt dCaesar} which will decrypt a ciphertext constructed by a Caesar's cipher. It should have the same signature as {\tt caesar}. Try it on the examples from the previous exercise. \end{Exercise} \begin{Answer}\ansref{ex:caesar:2} The code is almost identical, except we need to use a right rotate:\indRotRight \begin{code} dCaesar : {n} ([8], String n) -> String n dCaesar (s, msg) = [ shift x | x <- msg ] where map = ['A' .. 'Z'] >>> s shift c = map @ (c - 'A') \end{code} % dCaesar : {n} ([8], String n) -> String n % dCaesar (s, msg) = [ shift x | x <- msg ] % where map = ['A' .. 'Z'] >>> s % shift c = map @ (c - 'A') We have: \begin{Verbatim} Cryptol> caesar (12, "ATTACKATDAWN") "MFFMOWMFPMIZ" Cryptol> dCaesar (12, "MFFMOWMFPMIZ") "ATTACKATDAWN" \end{Verbatim} \end{Answer} \begin{Exercise}\label{ex:caesar:3} Observe that the shift amount in a Caesar cipher is very limited: Any shift of {\tt d} is equivalent to a shift by {\tt d \% 26}. (For instance shifting by 12 and 38 is the same thing, due to wrap around at 26.) Based on this observation, how strong do you think the Caesar's cipher is? Describe a simple attack that will recover the plaintext and automate it using Cryptol. Use your function to crack the ciphertext {\tt JHLZHYJPWOLYPZDLHR}. \end{Exercise} \begin{Answer}\ansref{ex:caesar:3} For the Caesar's cipher, the only good shifts are $1$ through $25$, since shifting by $0$ would return the plaintext unchanged, and any shift amount {\tt d} that is larger than $26$ and over is essentially the same as shifting by {\tt d \% 26} due to wrap around. Therefore, all it takes to break the Caesar cipher is to try the sizes $1$ through $25$, and see if we have a valid message. We can automate this in Cryptol by returning all possible plaintexts using these shift amounts: \begin{code} attackCaesar : {n} (String n) -> [25](String n) attackCaesar msg = [ dCaesar(i, msg) | i <- [1 .. 25] ] \end{code} If we apply this function to {\tt JHLZHYJPWOLYPZDLHR}, we get: \begin{Verbatim} Cryptol> :set ascii=on Cryptol> attackCaesar "JHLZHYJPWOLYPZDLHR", ["IGKYGXIOVNKXOYCKGQ", "HFJXFWHNUMJWNXBJFP", "GEIWEVGMTLIVMWAIEO" "FDHVDUFLSKHULVZHDN", "ECGUCTEKRJGTKUYGCM", "DBFTBSDJQIFSJTXFBL" "CAESARCIPHERISWEAK", "BZDRZQBHOGDQHRVDZJ", "AYCQYPAGNFCPGQUCYI" "ZXBPXOZFMEBOFPTBXH", "YWAOWNYELDANEOSAWG", "XVZNVMXDKCZMDNRZVF" "WUYMULWCJBYLCMQYUE", "VTXLTKVBIAXKBLPXTD", "USWKSJUAHZWJAKOWSC" "TRVJRITZGYVIZJNVRB", "SQUIQHSYFXUHYIMUQA", "RPTHPGRXEWTGXHLTPZ" "QOSGOFQWDVSFWGKSOY", "PNRFNEPVCUREVFJRNX", "OMQEMDOUBTQDUEIQMW" "NLPDLCNTASPCTDHPLV", "MKOCKBMSZROBSCGOKU", "LJNBJALRYQNARBFNJT" "KIMAIZKQXPMZQAEMIS"] \end{Verbatim} If you skim through the potential ciphertexts, you will see that the $7^{th}$ entry is probably the one we are looking for. Hence the key must be $7$. Indeed, the message is {\tt CAESARCIPHERISWEAK}. \end{Answer} \begin{Exercise}\label{ex:caesar:4} One classic trick to strengthen ciphers is to use multiple keys. By repeatedly encrypting the plaintext multiple times we can hope that it will be more resistant to attacks. Do you think this scheme might make the Caesar cipher stronger? \end{Exercise} \begin{Answer}\ansref{ex:caesar:4} No. Using two shifts $d_1$ and $d_2$ is essentially the same as using just one shift with the amount $d_1 + d_2$. Our attack function would work just fine on this schema as well. In fact, we wouldn't even have to know how many rounds of encryption was applied. Multiple rounds is just as weak as a single round when it comes to breaking the Caesar's cipher. \end{Answer} \begin{Exercise}\label{ex:caesar:5} What happens if you pass {\tt caesar} a plaintext that has non-uppercase letters in it? (Let's say a digit.) How can you fix this deficiency? \end{Exercise} \begin{Answer}\ansref{ex:caesar:5} In this case we will fail to find a mapping: \begin{Verbatim} Cryptol> caesar (3, "12") ... index of 240 is out of bounds (valid range is 0 thru 25). \end{Verbatim} What happened here is that Cryptol computed the offset {\tt '1' - 'A'} to obtain the $8$-bit index $240$ (remember, modular arithmetic!), but our alphabet only has $26$ entries, causing the out-of-bounds error. \todo[inline]{Say something about how to guarantee that such errors are impossible. (Use of preconditions, checking and proving safety, etc.)} We can simply remedy this problem by allowing our alphabet to contain all $8$-bit numbers:\indRotLeft \begin{code} caesar' : {n} ([8], String n) -> String n caesar' (s, msg) = [ shift x | x <- msg ] where map = [0 .. 255] <<< s shift c = map @ c \end{code} Note that we no longer have to subtract {\tt 'A'}, since we are allowing a much wider range for our plaintext and ciphertext. (Another way to put this is that we are subtracting the value of the first element in the alphabet, which happens to be 0 in this case! Consequently, the number of ``good'' shifts increase from $25$ to $255$.) The change in {\tt dCaesar'} is analogous:\indRotRight \begin{code} dCaesar' : {n} ([8], String n) -> String n dCaesar' (s, msg) = [ shift x | x <- msg ] where map = [0 .. 255] >>> s shift c = map @ c \end{code} \end{Answer} %===================================================================== \section{\texorpdfstring{Vigen\`{e}re}{Vigenere} cipher} \label{sec:vigenere} \sectionWithAnswers{\texorpdfstring{Vigen\`{e}re}{Vigenere} cipher}{sec:vigenere} The Vigen\`{e}re cipher is a variation on the Caesar's cipher, where one uses multiple shift amounts according to a keyword~\cite{wiki:vigenere}.\indVigenere Despite its simplicity, it earned the notorious description {\em le chiffre ind\`{e}chiffrable} (``the indecipherable cipher'' in French), as it was unbroken for a long period of time. It was very popular in the 16th century and onwards, only becoming routinely breakable by mid-19th century or so. To illustrate the operation of the Vigen\`{e}re cipher, let us consider the plaintext {\tt ATTACKATDAWN}. The cryptographer picks a key, let's say {\tt CRYPTOL}. We line up the plaintext and the key, repeating the key as much as as necessary, as in the top two lines of the following: \begin{tabbing} \hspace*{2cm} \= Ciphertext: \hspace*{.5cm} \= {\tt CKRPVYLVUYLG} \kill \> Plaintext : \> {\tt ATTACKATDAWN} \\ \> Cipherkey : \> {\tt CRYPTOLCRYPT} \\ \> Ciphertext: \> {\tt CKRPVYLVUYLG} \end{tabbing} We then proceed pair by pair, shifting the plaintext character by the distance implied by the corresponding key character. The first pair is {\tt A}-{\tt C}. Since {\tt C} is two positions away from {\tt A} in the alphabet, we shift {\tt A} by two positions, again obtaining {\tt C}. The second pair {\tt T}-{\tt R} proceeds similarly: Since {\tt R} is 17 positions away from {\tt A}, we shift {\tt T} down 17 positions, wrapping around {\tt Z}, obtaining {\tt K}. Proceeding in this fashion, we get the ciphertext {\tt CKRPVYLVUYLG}. Note how each step of the process is a simple application of the Caesar's cipher.\indCaesarscipher \begin{Exercise}\label{ex:vigenere:0} One component of the Vigen\`{e}re cipher is the construction of the repeated key. Write a function {\tt cycle} with the following signature: \begin{code} cycle : {n, a} (fin n, n >= 1) => [n]a -> [inf]a \end{code} such that it returns the input sequence appended to itself repeatedly, turning it into an infinite sequence. Why do we need the predicate {\tt n >= 1}?\indPredicates \end{Exercise} \begin{Answer}\ansref{ex:vigenere:0} Here is one way to define {\tt cycle}, using a recursive definition: \begin{code} cycle xs = xss where xss = xs # xss \end{code} We have: \begin{Verbatim} Cryptol> cycle [1 .. 3] [1, 2, 3, 1, 2, ...] \end{Verbatim} If we do not have the {\tt n >= 1} predicate, then we can pass {\tt cycle} the empty sequence, which would cause an infinite loop emitting nothing. The predicate {\tt n >= 1} makes sure the input is non-empty, guaranteeing that {\tt cycle} can produce the infinite sequence. \end{Answer} \begin{Exercise}\label{ex:vigenere:1} Program the Vigen\`{e}re cipher in Cryptol. It should have the signature: \begin{code} vigenere : {n, m} (fin n, n >= 1) => (String n, String m) -> String m \end{code} where the first argument is the key and the second is the plaintext. Note how the signature ensures that the input string and the output string will have precisely the same number of characters, {\tt m}. \lhint{Use Caesar's cipher repeatedly.} \end{Exercise} \begin{Answer}\ansref{ex:vigenere:1} \begin{code} vigenere (key, pt) = join [ caesar (k - 'A', [c]) | c <- pt | k <- cycle key ] \end{code} Note the shift is determined by the distance from the letter {\tt 'A'} for each character. Here is the cipher in action: \begin{Verbatim} Cryptol> vigenere ("CRYPTOL", "ATTACKATDAWN") "CKRPVYLVUYLG" \end{Verbatim} \end{Answer} \begin{Exercise}\label{ex:vigenere:2} Write the decryption routine for Vigen\`{e}re. Then decode \\ {\tt "XZETGSCGTYCMGEQGAGRDEQC"} with the key {\tt "CRYPTOL"}. \end{Exercise} \begin{Answer}\ansref{ex:vigenere:2} Following the lead of the encryption, we can rely on {\tt dCaesar}: \begin{code} dVigenere : {n, m} (fin n, n >= 1) => (String n, String m) -> String m dVigenere (key, pt) = join [ dCaesar (k - 'A', [c]) | c <- pt | k <- cycle key ] \end{code} The secret code is: \begin{Verbatim} Cryptol> dVigenere ("CRYPTOL", "XZETGSCGTYCMGEQGAGRDEQC") "VIGENERECANTSTOPCRYPTOL" \end{Verbatim} \end{Answer} \begin{Exercise}\label{ex:vigenere:3} A known-plaintext attack\indKnownPTAttack is one where an attacker obtains a plaintext-ciphertext pair, without the key. If the attacker can figure out the key based on this pair then he can break all subsequent communication until the key is replaced. Describe how one can break the Vigen\`{e}re cipher if a plaintext-ciphertext pair is known. \end{Exercise} \begin{Answer}\ansref{ex:vigenere:3} All it takes is to decrypt using using the plaintext as the key and message as the cipherkey. Here is this process in action. Recall from the previous exercise that encrypting {\tt ATTACKATDAWN} by the key {\tt CRYPTOL} yields {\tt CKRPVYLVUYLG}. Now, if an attacker knows that {\tt ATTACKATDAWN} and {\tt CKRPVYLVUYLG} form a pair, he/she can find the key simply by:\indVigenere \begin{Verbatim} Cryptol> dVigenere ("ATTACKATDAWN", "CKRPVYLVUYLG") "CRYPTOLCRYPT" \end{Verbatim} Note that this process will not always tell us what the key is precisely. It will only be the key repeated for the given message size. For sufficiently large messages, or when the key does not repeat any characters, however, it would be really easy for an attacker to glean the actual key from this information. This trick works since the act of using the plaintext as the key and the ciphertext as the message essentially reverses the shifting process, revealing the shift amounts for each pair of characters. The same attack would essentially work for the Caesar's cipher as well, where we would only need one character to crack it.\indCaesarscipher \end{Answer} %%%% Way too complicated for the intro.. skipping for now %% \section{Rail fence cipher} %% \lable{sec:railfence} %% \sectionWithAnswers{Rail fence cipher}{sec:railfence}\indRailFence %% The $k$-rail fence cipher is a simple example of a transposition %% cipher\indTranspositioncipher, where the text is written along {\em %% k}-lines in a zig-zag fashion. For instance, to encrypt {\tt %% ATTACKATDAWN} using a 3-rail fence, we construct the following %% text: %% \begin{Verbatim} %% A . . . C . . . D . . . %% . T . A . K . T . A . N %% . . T . . . A . . . W . %% \end{Verbatim} %% going down and up the 3 fences in a zigzag fashion. We then read %% the ciphertext\indCiphertext line by line to obtain: %% \begin{Verbatim} %% ACDTAKTANTAW %% \end{Verbatim} %% %% \begin{Exercise}\label{ex:railfence:0} %% Program the 3-rail fence cipher in Cryptol. You should write the %% functions: %% \begin{code} %% rail3Fence, dRail3Fence : {a} (fin a) => String((4*a)) -> String ((4*a)); %% \end{code} %% that implements the 3-rails encryption/decryption. Using your %% functions, encrypt and decrypt the message {\tt %% RAILFENCECIPHERISTRICKIERTHANITLOOKS}. %% \end{Exercise} %% \begin{Answer}\ansref{ex:railfence:0} %% \begin{code} %% rail3Fence pt = heads # mids # tails %% where { %% regions = groupBy (4, pt); %% heads = [| r @ 0 || r <- regions |]; %% mids = join [| [(r @ 1) (r @ 3)] || r <- regions |]; %% tails = [| r @ 2 || r <- regions |]; %% }; %% \end{code} %% \end{Answer} %===================================================================== \section{The atbash} \label{sec:atbash} \sectionWithAnswers{The atbash}{sec:atbash} The atbash cipher is a form of a shift cipher, where each letter is replaced by the letter that occupies its mirror image position in the alphabet.\indAtbash That is, {\tt A} is replaced by {\tt Z}, {\tt B} by {\tt Y}, etc. Needless to say the atbash is hardly worthy of cryptographic attention, as it is trivial to break. \begin{Exercise}\label{ex:atbash:0} Program the atbash in Cryptol. What is the code for {\tt ATTACKATDAWN}? \end{Exercise} \begin{Answer}\ansref{ex:atbash:0} Using the reverse index operator, coding atbash is trivial:\indRIndex\indAtbash \begin{code} atbash : {n} String n -> String n atbash pt = [ alph ! (c - 'A') | c <- pt ] where alph = ['A' .. 'Z'] \end{code} We have: \begin{Verbatim} Cryptol> atbash "ATTACKATDAWN" "ZGGZXPZGWZDM" \end{Verbatim} \end{Answer} \begin{Exercise}\label{ex:atbash:1} Program the atbash decryption in Cryptol. Do you have to write any code at all? Break the code {\tt ZGYZHSRHHVOUWVXIBKGRMT}. \end{Exercise} \begin{Answer}\ansref{ex:atbash:1} Notice that decryption for atbash\indAtbash is precisely the same as encryption, the process is entirely the same. So, we do not have to write any code at all, we can simply define: \begin{code} dAtbash : {n} String n -> String n dAtbash = atbash \end{code} We have: \begin{Verbatim} Cryptol> dAtbash "ZGYZHSRHHVOUWVXIBKGRMT" "ATBASHISSELFDECRYPTING" \end{Verbatim} \end{Answer} %===================================================================== \section{Substitution ciphers} \label{section:subst} \sectionWithAnswers{Substitution ciphers}{section:subst} Substitution ciphers\indSubstitutioncipher generalize all the ciphers we have seen so far, by allowing arbitrary substitutions to be made for individual ``components'' of the plaintext~\cite{wiki:substitution}. Note that these components need not be individual characters, but rather can be pairs or even triples of characters that appear consecutively in the text. (The multi-character approach is termed {\em polygraphic}.)\indPolyGraphSubst Furthermore, there are variants utilizing multiple {\em polyalphabetic} mappings,\indPolyAlphSubst as opposed to a single {\em monoalphabetic} mapping\indMonoAlphSubst. We will focus on monoalphabetic simple substitutions, although the other variants are not fundamentally more difficult to implement. \tip{For the exercises in this section we will use a running key repeatedly. To simplify your interaction with Cryptol, put the following definition in your program file:} \begin{code} substKey : String 26 substKey = "FJHWOTYRXMKBPIAZEVNULSGDCQ" \end{code} The intention is that {\tt substKey} maps {\tt A} to {\tt F}, {\tt B} to {\tt J}, {\tt C} to {\tt H}, and so on. \begin{Exercise}\label{ex:subst:0} Implement substitution ciphers in Cryptol. Your function should have the signature: \begin{code} subst : {n} (String 26, String n) -> String n \end{code} where the first element is the key (like {\tt substKey}). What is the code for \\ {\tt "SUBSTITUTIONSSAVETHEDAY"} for the key {\tt substKey}? \end{Exercise} \begin{Answer}\ansref{ex:subst:0} \begin{code} subst (key, pt) = [ key @ (p - 'A') | p <- pt ] \end{code} We have: \begin{Verbatim} Cryptol> subst(substKey, "SUBSTITUTIONSSAVETHEDAY") "NLJNUXULUXAINNFSOUROWFC" \end{Verbatim} \end{Answer} \paragraph*{Decryption} Programming decryption is more subtle. We can no longer use the simple selection operation ({\tt @})\indIndex on the key. Instead, we have to search for the character that maps to the given ciphertext character. \begin{Exercise}\label{ex:subst:1} Write a function {\tt invSubst} with the following signature: %% type Char = [8] // now in prelude.cry \begin{code} invSubst : (String 26, Char) -> Char \end{code} such that it returns the mapped plaintext character. For instance, with {\tt substKey}, {\tt F} should get you {\tt A}, since the key maps {\tt A} to {\tt F}: \begin{Verbatim} Cryptol> invSubst (substKey, 'F') A \end{Verbatim} And similarly for other examples: \begin{Verbatim} Cryptol> invSubst (substKey, 'J') B Cryptol> invSubst (substKey, 'C') Y Cryptol> invSubst (substKey, 'Q') Z \end{Verbatim} One question is what happens if you search for a non-existing character. In this case you can just return {\tt 0}, a non-valid ASCII character, which can be interpreted as {\em not found}. \hint{Use a fold (see Pg.~\pageref{par:fold}).}\indFold \end{Exercise} \begin{Answer}\ansref{ex:subst:1} \begin{code} invSubst (key, c) = candidates ! 0 where candidates = [0] # [ if c == k then a else p | k <- key | a <- ['A' .. 'Z'] | p <- candidates ] \end{code} The comprehension\indComp defining {\tt candidates} uses a fold (see page~\pageref{par:fold}).\indFold The first branch ({\tt k <- key}) walks through all the key elements, the second branch walks through the ordinary alphabet ({\tt a <- ['A' .. 'Z']}), and the final branch walks through the candidate match so far. At the end of the fold, we simply return the final element of {\tt candidates}. Note that we start with {\tt 0} as the first element, so that if no match is found we get a {\tt 0} back. \end{Answer} \begin{Exercise}\label{ex:subst:2} Using {\tt invSubst}, write the decryption function {\tt dSubst}. It should have the exact same signature as {\tt subst}. Decrypt {\tt FUUFHKFUWFGI}, using our running key. \end{Exercise} \begin{Answer}\ansref{ex:subst:2} \begin{code} dSubst: {n} (String 26, String n) -> String n dSubst (key, ct) = [ invSubst (key, c) | c <- ct ] \end{code} We have: \begin{Verbatim} Cryptol> dSubst (substKey, "FUUFHKFUWFGI") "ATTACKATDAWN" \end{Verbatim} \end{Answer} \todo[inline]{This exercise and the true type of \texttt{invSubst} indicate that specs are needed. In other words, we cannot capture \texttt{invSubst}'s tightest type, which would encode the invariant about contents being capital letters, and that lack of expressiveness leaks to \texttt{dSubst}. We really need to either enrich the dependent types or add some kind of support for contracts. The reason this works most of the time is that crypto algorithms work on arbitrary bytes.} \begin{Exercise}\label{ex:subst:3} Try the substitution cipher with the key {\tt AAAABBBBCCCCDDDDEEEEFFFFGG}. Does it still work? What is special about {\tt substKey}? \end{Exercise} \begin{Answer}\ansref{ex:subst:3} No, with this key we cannot decrypt properly: \begin{Verbatim} Cryptol> subst ("AAAABBBBCCCCDDDDEEEEFFFFGG", "HELLOWORLD") "BBCCDFDECA" Cryptol> dSubst ("AAAABBBBCCCCDDDDEEEEFFFFGG", "BBCCDFDECA") "HHLLPXPTLD" \end{Verbatim} This is because the given key maps multiple plaintext letters to the same ciphertext letter. (For instance, it maps all of {\tt A}, {\tt B}, {\tt C}, and {\tt D} to the letter {\tt A}.) For substitution ciphers to work the key should not repeat the elements, providing a 1-to-1 mapping. This property clearly holds for {\tt substKey}. Note that there is no shortage of keys, since for 26 letters we have 26! possible ways to choose keys, which gives us over 4-billion different choices. \end{Answer} %===================================================================== \section{The scytale} \label{sec:scytale} \sectionWithAnswers{The scytale}{sec:scytale} The scytale is one of the oldest cryptographic devices ever, dating back to at least the first century A.D.~\cite{wiki:scytale}.\indScytale Ancient Greeks used a leather strip on which they would write their plaintext\indPlaintext message. The strip would be wrapped around a rod of a certain diameter. Once the strip is completely wound, they would read the text row-by-row, essentially transposing the letters and constructing the ciphertext\indCiphertext. Since the ciphertext is formed by a rearrangement of the plaintext, the scytale is an example of a transposition cipher.\indTranspositioncipher To decrypt, the ciphertext needs to be wrapped around a rod of the same diameter, reversing the process. The cipherkey\indCipherkey is essentially the diameter of the rod used. Needless to say, the scytale does not provide a very strong encryption mechanism. Abstracting away from the actual rod and the leather strip, encryption is essentially writing the message column-by-column in a matrix and reading it row-by-row. Let us illustrate with the message {\tt ATTACKATDAWN}, where we can fit 4 characters per column: \begin{verbatim} ACD TKA TAW ATN \end{verbatim} To encrypt, we read the message row-by-row, obtaining {\tt ACDTKATAWATN}. If the message does not fit properly (i.e., if it has empty spaces in the last column), it can be padded by {\tt Z}'s or some other agreed upon character. To decrypt, we essentially reverse the process, by writing the ciphertext row-by-row and reading it column-by-column. Notice how the scytale's operation is essentially matrix transposition. Therefore, implementing the scytale in Cryptol is merely an application of the {\tt transpose} function.\indTranspose All we need to do is group the message by the correct number of elements using {\tt split}.\indSplit Below, we define the {\tt diameter} to be the number of columns we have. The type synonym {\tt Message} ensures we only deal with strings that properly fit the ``rod,'' by using {\tt r} number of rows:\indJoin \begin{code} scytale : {row, diameter} (fin row, fin diameter) => String (row * diameter) -> String (diameter * row) scytale msg = join (transpose msg') where msg' : [diameter][row][8] msg' = split msg \end{code} The signature\indSignature on {\tt msg'} is revealing: We are taking a string that has {\tt diameter * row} characters in it, and chopping it up so that it has {\tt row} elements, each of which is a string that has {\tt diameter} characters in it. Here is Cryptol in action, encrypting the message {\tt ATTACKATDAWN}: \begin{Verbatim} Cryptol> :set ascii=on Cryptol> scytale "ATTACKATDAWN" "ACDTKATAWATN" \end{Verbatim} Decryption is essentially the same process, except we have to {\tt split} so that we get {\tt diameter} elements out:\indSplit\indJoin\indScytale \begin{code} dScytale : {row, diameter} (fin row, fin diameter) => String (row * diameter) -> String (diameter * row) dScytale msg = join (transpose msg') where msg' : [row][diameter][8] msg' = split msg \end{code} Again, the type on {\tt msg'} tells Cryptol that we now want {\tt diameter} strings, each of which is {\tt row} long. It is important to notice that the definitions of {\tt scytale} and {\tt dScytale} are precisely the same, except for the signature on {\tt msg'}! When viewed as a matrix, the types precisely tell which transposition we want at each step. We have: \begin{Verbatim} Cryptol> dScytale "ACDTKATAWATN" "ATTACKATDAWN" \end{Verbatim} \begin{Exercise}\label{ex:scytale:0} What happens if you comment out the signature for {\tt msg'} in the definition of {\tt scytale}? Why?\indScytale \end{Exercise} \begin{Answer}\ansref{ex:scytale:0} If you do not provide a signature for {\tt msg'}, you will get the following type-error message from Cryptol: \begin{small} \begin{Verbatim} Failed to validate user-specified signature. In the definition of 'scytale', at classic.cry:40:1--40:8: for any type row, diameter fin row fin diameter => fin ?b arising from use of expression split at classic.cry:42:17--42:22 fin ?d arising from use of expression join at classic.cry:40:15--40:19 row * diameter == ?a * ?b arising from matching types at classic.cry:1:1--1:1 \end{Verbatim} \end{small} Essentially, Cryptol is complaining that it was asked to do a {\tt split}\indSplit and it figured that the constraint $\text{\emph{diameter}}*\text{\emph{row}}=a*b$ must hold, but that is not sufficient to determine what {\tt a} and {\tt b} should really be. (There could be multiple ways to assign {\tt a} and {\tt b} to satisfy that requirement, for instance {\tt a=4}, {\tt b=row}; or {\tt a=2} and {\tt b=2*row}, resulting in differing behavior.) This is why it is unable to ``validate the user-specified signature''. By putting the explicit signature for {\tt msg'}, we are giving Cryptol more information to resolve the ambiguity. Notice that since the code for {\tt scytale} and {\tt dScytale} are precisely the same except for the type on {\tt msg'}. This is a clear indication that the type signature plays an essential role here.\indAmbiguity\indSignature \end{Answer} \begin{Exercise}\label{ex:scytale:1} How would you attack a scytale encryption, if you don't know what the diameter is? \end{Exercise} \begin{Answer}\ansref{ex:scytale:1} Even if we do not know the diameter, we do know that it is a divisor of the length of the message. For any given message size, we can compute the number of divisors of the size and try decryption until we find a meaningful plaintext. Of course, the number of potential divisors will be large for large messages, but the practicality of scytale stems from the choice of relatively small diameters, hence the search would not take too long. (With large diameters, the ancient Greeks would have to carry around very thick rods, which would not be very practical in a battle scenario!)\indScytale \end{Answer} %%% Local Variables: %%% mode: latex %%% TeX-master: "../main/Cryptol" %%% End: