Page 1 of 1

encoding and decoding randomly

Posted: Fri Nov 06, 2015 8:09 am
by lispstudent
There is a file which includes words from beginning of 26 letter english alphabet to end about 16000 or more Like *words.lisp* of content,

Code: Select all

    (defparameter *words* '(
    					(a b a c a)
    					(a b a c a s)
                        (a b a c k)
                        .
                        .
                        .
                        .
                        (z y m o t i c)
                        (z y m u r g y))
As I said the alphabet file also there is. Like, *alph.lisp* of content

Code: Select all

`(defparameter *alphabet* '(a b c d e f g h i j k l m n o p q r s t u v w x y z))`
I have been trying to write a function that finds the encoder of a given encrypted paragraph using the words file. As you can see, words are represented as atom lists and paragraphs are represented as list of lists (word lists). Alphabet is defined as atom list. Encrypted paragraph will be the argument of the function and as known the words and alphabet. Encoder is a one-to-one, randomly shuffled alphabet

Code: Select all

    Original Paragraph:
    ((A N X I O U S) ( H O M E ) ( N O I S E ))
    Encrypted Paragraph:
    ((G P N Y A X Q) ( J A B L ) ( P A Y Q L ))
    
    Alphabet: (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
    
    I try to find the encrypter 
    
    Encrypter  : (G Z O C L H T J Y M E R B P A F W K Q I X U S N V D)
I want advices what should I do? How can I start?

Code: Select all

    Example input
    (find-encrypter '((G P N Y A X Q) ( J A B L ) ( P A Y Q L )))
    Example output:
    (G Z O C L H T J Y M E R B P A F W K Q I X U S N V D)
Could you guide?

Re: encoding and decoding randomly

Posted: Fri Nov 06, 2015 1:43 pm
by David Mullen
How do they define "the" right encoding? Seems like there could be (at least for some messages) more than one potential encoding.

Re: encoding and decoding randomly

Posted: Fri Nov 06, 2015 10:37 pm
by lispstudent
David Mullen wrote:How do they define "the" right encoding? Seems like there could be (at least for some messages) more than one potential encoding.
of course you are right there are I have just written three to be an example but actually around 15 encoded words will be entered as input to put out the encoder

Re: encoding and decoding randomly

Posted: Sat Nov 07, 2015 4:18 pm
by David Mullen
Well, I don't know how brute-force this is allowed to be, but brute force is all that comes to mind right now. My idea was to produce random alphabets until some of the words clicked with the dictionary, then recursively produce more alphabets while holding those letters—the ones so far matched—in fixed positions, and repeat until a viable alphabet is generated for the full sentence. Here goes:

Code: Select all

(defvar *mutated-alphabet*
  *alphabet*)

(defun letter-in-sentence-p (letter sentence)
  (loop for word in sentence thereis (member letter word)))

(defun make-mutated-alphabet (fixed-words original-alphabet)
  (loop with mutated-alphabet = (copy-list original-alphabet)
        repeat (+ 100 (random 1000)) finally (return mutated-alphabet)
        do (let* ((i (random 26)) (letter1 (nth i mutated-alphabet))
                  (j (random 26)) (letter2 (nth j mutated-alphabet)))
             (unless (letter-in-sentence-p letter1 fixed-words)
               (unless (letter-in-sentence-p letter2 fixed-words)
                 (setf (nth i mutated-alphabet) letter2)
                 (setf (nth j mutated-alphabet) letter1))))))

(defun decrypt-word (word &optional (alphabet *mutated-alphabet*))
  (loop for letter in word as position = (position letter alphabet)
        collect (nth position *alphabet*) into decrypted-word
        finally (return (find decrypted-word *words* :test #'equal))))

(defun decrypt (sentence &optional fixed-words (iterations 5000000))
  (loop with sentence-length of-type fixnum = (length sentence)
        for iteration of-type fixnum from 0 below iterations
        do (let* ((*mutated-alphabet*
                   (make-mutated-alphabet fixed-words *mutated-alphabet*))
                  (decrypted-sentence (mapcar #'decrypt-word sentence))
                  (failure-count (count nil decrypted-sentence)))
             (declare (type fixnum failure-count))
             (when (zerop failure-count)
               (return (values decrypted-sentence *mutated-alphabet* iteration)))
             (when (> (the fixnum (- sentence-length failure-count)) (length fixed-words))
               (loop for n = 0 then (1+ n) for decrypted-word in decrypted-sentence
                     when decrypted-word collect (nth n sentence) into recursive-fixed-words
                     finally (multiple-value-bind (decrypted-sentence mutated-alphabet)
                                 (decrypt sentence recursive-fixed-words 5000)
                               (when decrypted-sentence
                                 (return-from decrypt
                                   (values decrypted-sentence
                                           mutated-alphabet
                                           iteration)))))))))
Running this gave me:

Code: Select all

? (decrypt '((G P N Y A X Q) ( J A B L ) ( P A Y Q L )))
((A N X I O U S) (H O M E) (N O I S E))
(G T W R L C U J Y K S F B P A H O E Q I X Z V N D M)
2141108
So it took over two million iterations (and several minutes) to produce the "encrypter" alphabet of (G T W R L C U J Y K S F B P A H O E Q I X Z V N D M). I don't have your dictionary, so I don't really know whether it would work better with a longer message or not. Maybe the only way to speed it up is to do frequency analysis, or break down the words into smaller chunks of letters, or whatever it is that the experts do to crack these kinds of ciphers.