2012-06-15

SICP Exercise 2.92: Dealing With Different Indeterminates - The Naïve Approach

By imposing an ordering on variables, extend the polynomial package so that addition and multiplication of polynomials works for polynomials in different variables. (This is not easy!)
I'm going to try to tackle this in a couple of ways. In this post I'm going to take a naïve approach, which will allow us to explore some portions of a solution. I hope to follow this up with a full "this is not easy!" approach.

The Naïve Approach

The naïve approach of ensuring "that both polynomials have the same principal variable" comes from the observation that a polynomial of one indeterminate can be expressed as a polynomial of another indeterminate by simply treating the former polynomial as the coefficient of the zero-order term of the latter polynomial.

In case that's not clear, here's an example of what I mean. I'll explicitly give the indeterminate and order of each term, including the order 1 and order 0 terms so it's easier to see what's been done. We can express a polynomial in y of the form:
Cnyn + Cn-1yn-1 + Cn-2yn-2 + … + C2y2 + C1y1 + C0y0
as a polynomial in x as follows:
(Cnyn + Cn-1yn-1 + Cn-2yn-2 + … + C2y2 + C1y1 + C0y0)x0
Why is this naïve? Well it won't cope particularly well when performing an arithmetic operations on a pair of polynomials where the first polynomial is a polynomial in one variable whose coefficients are polynomials in a second variable and the second polynomial is a polynomial in the second variable whose coefficients are polynomials in the first variable. As a concrete example of this, let's use this approach to subtract a polynomial in y whose coefficients are polynomials in x from a polynomial in x whose coefficients are polynomials in y:
  ((5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 3) - ((5x2 + 2x)y2 + (2x2 + x)y - (x2 - 2x + 5))
= ((5y2 + 2y - 1)x2 + (2y2 + y + 2)x - ((5x2 + 2x)y2 + (2x2 + x)y - (x2 - 2x - 2))
...and that's as far as it goes.

To see why this is bad, let's see what happens if we "convert one polynomial to the type of the other by expanding and rearranging terms" and then perform the calculation. First, let's convert the second polynomial, which a polynomial in y with coefficients that are polynomials in x, to express it as a polynomial in x with coefficients that are polynomials in y:
  (5x2 + 2x)y2 + (2x2 + x)y - (x2 - 2x + 5)
= 5x2y2 + 2xy2 + 2x2y + xy - x2 + 2x - 5
= 5y2x2 + 2y2x + 2yx2 + yx - x2 + 2x - 5
= 5y2x2 + 2yx2 - x2 + 2y2x + yx + 2x - 5
= (5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 5
Now we substitute the converted polynomial into the calculation:
  ((5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 3) - ((5x2 + 2x)y2 + (2x2 + x)y - (x2 - 2x + 5))
= ((5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 3) - ((5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 5)
= (5y2 + 2y - 1)x2 + (2y2 + y + 2)x - 3 - (5y2 + 2y - 1)x2 - (2y2 + y + 2)x + 5
= (5y2 + 2y - 1)x2 - (5y2 + 2y - 1)x2 + (2y2 + y + 2)x - (2y2 + y + 2)x - 3 + 5
= 5 - 3
= 2
So, as we can see, the naïve approach will not simplify the results of arithmetic operations. However, as I said above, we'll not try to do anything more complex in this post than pursing the naïve approach.

Selecting a Variable

Regardless of approach, we need to be able to select an appropriate "principal variable" to express polynomials in before we can perform any conversion. The authors suggest that one way of achieving this is to "impose a towerlike structure on this by ordering the variables and thus always converting any polynomial to a 'canonical form' with the highest-priority variable dominant and the lower-priority variables buried in the coefficients." We can select any ordering for this, so long as it is predictable. Let's just use alphanumeric ordering.

You'll have noted that we're using Scheme symbols to represent our variables. Symbols have no concept of ordering associated with them... but strings do. We can convert symbols that represent variables into strings using the standard procedure symbol->string, and then test which should come first using the procedure string<=?. Of course, we'll need to remember the 'unbound variable we introduced in exercise 2.90. We should avoid selecting this as our principal if possible!

Programmatically we can express the selection as:
(define (select-principal-variable v1 v2)
  (cond ((eq? v1 'unbound) v2)
        ((eq? v2 'unbound) v1)
        (else (let ((s1 (symbol->string v1))
                    (s2 (symbol->string v2)))
                (if (string<=? s1 s2)
                    v1
                    v2)))))

Coercing to the Principal Variable

Given that we've determined the principal variable, the naïve approach makes coercing a polynomial so that it's expressed in terms of the principal variable is very straightforward. We simply check whether or not the polynomial is already expressed in terms of the principal variable and, if it isn't, we form a new polynomial in the principal variable with the original polynomial as the coefficient of the zero-order term.

Note that we can also bind 'unbound polynomials at this point. Doing so stops us from unnecessarily introducing an additional level of nesting to the coerced polynomial.

Here it is in code:
(define (express-in principal-variable p)
  (cond ((eq? principal-variable (variable p)) p)
        ((eq? 'unbound (variable p)) (make-poly principal-variable (term-list p)))
        (else (make-from-coeffs principal-variable (list (tag p))))))
Obviously this procedure lives within the polynomial package so that we can access the variable and term-list of the polynomial p. One implication of this is that p is untagged, hence the need to re-tag it before wrapping it in a list and passing it through to make-from-coeffs when we need to turn it into the coefficient of a zero-order term.

Making Use of the Coercion

Now all we need to do is to wire this in to the existing arithmetic operations. Currently add-poly, mul-poly and div-poly all test whether or not the polynomials passed to them are in the same variable and, if not, raise an error. sub-poly also applies this test, but does do via add-poly. We can remove this test entirely if we perform the coercion in the λ-expressions that are installed in the table as, by doing this change, we will guarantee that the polynomials passed to the procedures are in the same variable.

The general form of the change is to find the principal variable, coerce all of the polynomials so that they're expressed in terms of this variable and then invoke the appropriate procedure with these coerced polynomials. Rather than repeating this change in each of the λ-expressions we can note that these operations are all binary operations and extract the common functionality into a single procedure that takes the two polynomials and the procedure to apply and performs these steps:
(define (coerce-and-call p1 p2 op)
  (let* ((principal (select-principal-variable (variable p1) (variable p2)))
         (new-p1 (express-in principal p1))
         (new-p2 (express-in principal p2)))
    (op new-p1 new-p2)))
We can then modify the λ-expressions so that they use coerce-and-call rather than invoking the arithmetic operation procedures directly:
(put 'add '(polynomial polynomial) 
     (lambda (p1 p2) (tag (coerce-and-call p1 p2 add-poly))))
(put 'mul '(polynomial polynomial) 
     (lambda (p1 p2) (tag (coerce-and-call p1 p2 mul-poly))))
(put 'div '(polynomial polynomial) 
     (lambda (p1 p2)
       (let ((result (coerce-and-call p1 p2 div-poly)))
         (list (drop (tag (car result)))
               (drop (tag (cadr result)))))))
(put 'sub '(polynomial polynomial)
     (lambda (p1 p2) (tag (coerce-and-call p1 p2 sub-poly))))
Finally we can remove the same-variable? tests from the three arithmetic operation procedures:
;; procedures used by add-poly
(define (add-poly p1 p2)
  (make-poly (select-variable p1 p2)
             (add (term-list p1)
                  (term-list p2))))

;; procedures used by mul-poly
(define (mul-poly p1 p2)
  (make-poly (select-variable p1 p2)
             (mul (term-list p1)
                  (term-list p2))))

;; procedures used by div-poly
(define (div-poly p1 p2)
  (let ((variable (select-variable p1 p2))
        (result (div (term-list p1) (term-list p2))))
    (list (make-poly variable (car result))
          (make-poly variable (cadr result)))))

A Quick Spin

Okay, let's see this in action by trying out out the subtraction example we listed at the top of the exercise:
> (define poly-1
    (make-polynomial-from-coeffs
     'x
     (list (make-polynomial-from-coeffs
            'y
            (list (make-integer 5) (make-integer 2) (make-integer -1)))
           (make-polynomial-from-coeffs
            'y
            (list (make-integer 2) (make-integer 1) (make-integer 2)))
           (make-integer -3))))
> (define poly-2
    (make-polynomial-from-coeffs
     'y
     (list (make-polynomial-from-coeffs
            'x
            (list (make-integer 5) (make-integer 2) zero))
           (make-polynomial-from-coeffs
            'x
            (list (make-integer 2) (make-integer 1) zero))
           (make-polynomial-from-coeffs
            'x
            (list (make-integer -1) (make-integer 2) (make-integer -5))))))
> (sub poly-1 poly-2)
(polynomial x
            dense-terms
            (polynomial y
                        dense-terms
                        (integer . 5)
                        (integer . 2)
                        (integer . -1))
            (polynomial y
                        dense-terms
                        (integer . 2)
                        (integer . 1)
                        (integer . 2))
            (polynomial y
                        dense-terms
                        (polynomial x
                                    sparse-terms
                                    (term 2 (integer . -5))
                                    (term 1 (integer . -2)))
                        (polynomial x
                                    sparse-terms
                                    (term 2 (integer . -2))
                                    (term 1 (integer . -1)))
                        (polynomial x
                                    dense-terms
                                    (integer . 1)
                                    (integer . -2)
                                    (integer . 2))))
Right, now onto the full "this is not easy!" approach!

No comments:

Post a Comment