We recommend that the reader be familiar with our book A Numerical Approach to Real Algebraic Curves with the Wolfram Language, henceforth known as “my Plane Curve book”, or at least with the Mathematica Journal summary of this book (2018). And the reader should have some familiarity with the Wolfram Language.
Note the naming conventions: All global functions defined in this Space Curve Book begin with a lowercase letter; compound names capitalize the first letters of subsequent words (camel case). This avoids confusion with built-in Mathematica functions. Also, functions with polynomial and/or point arguments end in 2D, 3D, or MD depending on whether they work in 2, 3, or all dimensions. This makes clear what the arguments are and distinguishes these functions from my Plane Curve functions, so both sets can be initialized together without conflict; however, most plane curve functions that you may need are contained here with the 2D designation. Note that functions with suffix 3D or MD take variables as a list, but members of the list should be atomic variables, e.g. not X[[2]] but possibly x[2].
Disclaimer
The author makes no representations, express or implied, with respect to this documentation or software it describes, including, without limitation, any implied warranties of merchantability, interoperability or fitness for a particular purpose, all of which are expressly disclaimed. Use of Mathematica and other related software is subject to the terms and conditions as described at www.wolfram.com/legal. In addition to the foregoing, users should recognize that all complex software systems and their documentation contain errors and omissions. Barry H. Dayton and Wolfram Research a) shall not be responsible under any circumstances for providing information or corrections to errors and omissions discovered at any time in this book or software; b) shall not be liable for damages of any kind arising out of the use of (or inability to use) this book or software; c) do not recommend the use of the software for applications in which errors or omissions could threaten life, or cause injury or significant loss.
Mathematica and Wolfram Language are trademarks of Wolfram Research Inc.
we don’t have a canonical direction of travel on curves, unlike ℝ². Therefore tracing in ℝ³ is somewhat different. This tracing function takes what appears to be the shortest Euclidean distance to the end point. If the intended path does not go directly to the desired end, the trace may fail, so one should trace short or relatively straight paths only. Also, reversing the order of {f,g} or changing their signs makes no difference. In particular, tracing around a closed bounded component requires at least 3 paths. Finally, in the unlikely event of a singular point, you can trace into this point but not out. By default the procedure will stop after 30 steps; this can be changed to a different number n by the option maxit→n. If the maximum number of iterations is reached, the path will jump to the indicated end point as in the plane case.
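For readers who want to experiment outside the Wolfram Language, the predictor-corrector idea behind this kind of tracing can be sketched in a few lines of Python with NumPy. Everything below (the function trace3d, the step size, the test curve) is my own illustration, not the book's tracing code: step along the tangent direction (the cross product of the two gradients) oriented toward the end point, correct back onto the curve with Gauss-Newton steps, and finish at the end point once it is within one step.

```python
import numpy as np

def trace3d(F, J, start, end, h=0.05, maxit=100, tol=1e-10):
    """Trace the curve F == 0 (two equations, three unknowns) from
    start toward end.  Predictor: step along the tangent (cross product
    of the two gradient rows of J), oriented toward the end point.
    Corrector: Gauss-Newton steps back onto the curve."""
    p = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    path = [p.copy()]
    for _ in range(maxit):
        if np.linalg.norm(end - p) < h:        # close enough: finish
            path.append(end)
            break
        t = np.cross(J(p)[0], J(p)[1])         # tangent direction
        t /= np.linalg.norm(t)
        if np.dot(t, end - p) < 0:             # head toward the end point
            t = -t
        p = p + h * t                          # predictor step
        for _ in range(10):                    # corrector: back onto curve
            r = F(p)
            if np.linalg.norm(r) < tol:
                break
            p = p - np.linalg.pinv(J(p)) @ r
        path.append(p.copy())
    return path

# Quarter of the unit circle {x^2 + y^2 + z^2 - 1 = 0, z = 0}
F = lambda p: np.array([p[0]**2 + p[1]**2 + p[2]**2 - 1, p[2]])
J = lambda p: np.array([[2*p[0], 2*p[1], 2*p[2]], [0.0, 0.0, 1.0]])
path = trace3d(F, J, [1, 0, 0], [0, 1, 0])
```

Note this toy tracer shares the limitation described above: it always heads toward the end point, so it can only follow short or relatively straight arcs.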
This shows that our curve will be closed and bounded; in principle we can have a path from any critical point to any other. But applying pathFinder3D to get from critical point 4 to critical point 6 we get
However, the main technique in this book is to find the equation of a space curve after projection to the plane. In Chapter 2 we will learn how to do this algebraically from the equations, but for now we can simply project a sufficient number of sufficiently random points and reconstruct an equation by interpolating with my plane interpolation function aCurve. Here it is as a 2D function in our Space Curve global functions:
One difficult issue with space curves is calculating the degree of a projection. This depends on both the equations and the projection matrix. But generically, in the case of a naive curve given by equations of degrees d₁, d₂, the degree of a reasonably random plane projection is d₁·d₂.
For Example 1.1.1 both equations are quadratics, so the degree of the curve is 4. By interpolation we need 6·5/2 − 1 = 14 points. It turns out, relative to these specific equations, that Pxy is sufficiently random. We can get 14 points easily from our projected path tracing.
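The count 6·5/2 − 1 = 14 is an instance of a general formula: a plane curve of degree d has (d+1)(d+2)/2 coefficients, determined only up to scale, so (d+1)(d+2)/2 − 1 generic points suffice for interpolation. The idea behind such an interpolation can be sketched in NumPy (my own illustration, not the book's aCurve2D): evaluate all monomials of degree ≤ d at the sample points and take a null vector of the resulting matrix via the singular value decomposition.

```python
import numpy as np

def monomials(d, point):
    """Values of all monomials of total degree <= d at a 2-D point,
    degree by degree: 1, x, y, x^2, x y, y^2, ..."""
    x, y = point
    vals = []
    for k in range(d + 1):
        for i in range(k + 1):              # x^(k-i) * y^i
            vals.append(x ** (k - i) * y ** i)
    return vals

def fit_curve(points, d):
    """Interpolating curve: the coefficient vector is the right singular
    vector belonging to the smallest singular value (a null vector)."""
    M = np.array([monomials(d, p) for p in points])
    return np.linalg.svd(M)[2][-1]

# Sample the circle x^2 + y^2 - 1 = 0; degree 2 needs 4*3/2 - 1 = 5 points
ts = np.linspace(0.0, 2 * np.pi, 7)[:-1]
pts = [(np.cos(t), np.sin(t)) for t in ts]
c = fit_curve(pts, 2)
c = c / c[0]    # normalize the constant term
# coefficient order: 1, x, y, x^2, x y, y^2
```

On this exact sample the recovered coefficients are (1, 0, 0, −1, 0, −1), i.e. 1 − x² − y², the circle up to sign.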
In[]:=
pts2=RandomSample[pth,14];g=aCurve2D[pts2,x,y]
Out[]=
3.35997-6.01438×10^-13 x-0.839992 x^2+3.99152×10^-13 x^3-0.839992 x^4-4.91802×10^-13 y-5.67127×10^-13 x y+4.48573×10^-13 x^2 y+1.99789×10^-13 x^3 y-0.839992 y^2+3.16171×10^-13 x y^2+1.67998 x^2 y^2+1.96307×10^-13 y^3+3.63534×10^-13 x y^3-0.839992 y^4
By symmetry we don’t expect terms with odd degrees in either variable so we can chop small coefficients.
In[]:=
pf1=Chop[g,1.*^-9]
Out[]=
3.35997-0.839992 x^2-0.839992 x^4-0.839992 y^2+1.67998 x^2 y^2-0.839992 y^4
In fact, this looks like an exact polynomial, so we divide by the smallest coefficient.
In fact, the actual point projection is only part of a circle! This is an important lesson: the point projection of an algebraic space curve will lie in an algebraic curve but may not be the entire curve. The smallest algebraic curve containing the point projection is known to algebraic geometers as the Zariski closure of the projection.
This is why it is important to use generic, that is, random projections. Sometimes it is useful, for replication, to have a fixed pseudo-random projection that we can use over and over. The one I have chosen is known as prd3D and is given by
This brings up another issue. When curves are projected, the projection may have singular points even though the original curve did not have a singular point, or at least not one that projects to this singularity. I will call such points, non-standardly, artifactual. In fact, for many curves, including this one, generic projections must include artifactual points, although very possibly complex or infinite. We will discuss this at the end of this book when considering genus. In addition to ordinary crossings these artifactual singularities may be cusps or isolated points.
For an example of an artifactual cusp we introduce the famous twisted cubic, to be discussed at the beginning of Chapter 2. This is a curve generally given parametrically as t ↦ {t, t², t³}.
As we will explain in Chapter 3 such curves are algebraic, although even in ℝ³ not necessarily naive. In fact this curve is the poster child for non-naive curves, but it is contained in the naive curve
In[]:=
F2={y-x^2,z-x y};
where the extra component lies in the infinite plane, so it won’t influence this discussion. If we project to the plane with Pyz, which sends the first component to 0, then from the parametric expression we get the parametric plane curve t ↦ {t², t³}, which we recognize as a cusp. Or we can easily describe a set of points plotting the curve
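As a quick numeric spot check (plain Python, my own illustration, not book code): the projected points (t², t³) satisfy the cusp equation y² = x³ identically.

```python
# The twisted cubic t -> (t, t^2, t^3) projected by dropping the first
# coordinate gives (t^2, t^3), which lies on the cuspidal cubic y^2 = x^3.
pts = [(t * t, t * t * t) for t in [k / 10 - 1 for k in range(21)]]
residuals = [abs(y * y - x ** 3) for (x, y) in pts]
# every sample satisfies y^2 - x^3 = 0 up to roundoff
```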
As for the possibility of the projection having isolated artifactual singular points, the easiest example is projecting the z-axis, that is the naive space curve {x=0, y=0}, with Pxy.
One can certainly find non-singular curves and projections giving more complicated artifactual singularities. For example see the section on blowing-up in Chapter 3 to see how to make any plane singularity artifactual. But for this to happen with a truly generic projection generated independently from the curve is very unlikely.
1.2.1 Nice Example: Viviani Curve
The Viviani Curve [see https://www.wolframalpha.com/input/?i=Viviani+Curve] gives a nice example of a singular space curve which looks very different depending on the projection. The curve, often seen as a parametric curve, is given implicitly by
In[]:=
v1=x^2+y^2+z^2-4;v2=(x-1)^2+y^2-1;V={v1,v2}
Out[]=
{-4+x^2+y^2+z^2,-1+(-1+x)^2+y^2}
One can use either method of 1.1 or 1.2, or a parameterization, to draw the curve:
Note that the point where the branches seem to cross is actually the singular point {2,0,0}, where they do cross
In[]:=
tangentVector3D[V,{2,0,0},{x,y,z}]
»
No tangent vector at {2,0,0}
Out[]=
{0.,0.,0.}
Using projections, the best is, as usual, our pseudo-random prd3D, or its FLT version fprd3D, which gives a 4th degree plane curve.
In[]:=
vd2=FLTMD[V,fprd3D,4,{x,y,z},{x,y},dTol][[1]]
Out[]=
1.+4.30229 x+3.68817 x^2+0.024428 x^3+0.000444366 x^4-2.00048 y+0.312204 x y+1.0116 x^2 y-3.77986 y^2-1.05115 x y^2+0.0400199 x^2 y^2+0.511479 y^3+0.901056 y^4
This curve can be drawn in color giving 4 segments, where the center red point is the image of the singularity; the other 2 are 2D critical points.
Projecting on the x,y plane using the projection fCompProj[3,3] gives the circle
where again the red point is the image of the singular point. Each other point of the circle has a 2 point fiber. This is an example of how a non-generic projection can take a singular point to a non-singular point.
Here we get a parabola. But the bounded Viviani curve can’t linearly project onto the unbounded parabola. In fact the image lies in the range 0 ≤ x ≤ 2, where each image point other than the end points has a two point fiber. As one starts at the singularity of the Viviani curve and goes around a loop, the projection starts at {2,0}, goes out one colored branch of the parabola and back on the same branch to {2,0}. This is a good example of how a space curve may not map onto the FLT projection curve, particularly in the case of a non-random projection.
The non-random projection on the y-z plane does act somewhat like the random prd3D projection, giving a 4th degree curve.
The red point is the image of the singular point of the Viviani curve, and the other 2 singularities are artifactual singularities from the projection. This is expected: since the Viviani curve has a rational parameterization it has genus 0, so generically we expect 3 singular points in the projection. Actually the fprd3D[2,3] and the non-random projection on the y-z plane have isolated singularities, the former a double singularity at the infinite point {1,0,0} and the latter two real plane isolated singularities. So all the projections remain rational curves.
In the plane curve book I defined Fractional Linear Transformations in Chapter 6 and used them heavily there. On the point level these are given in the Wolfram Language under the name TransformationFunction. My abbreviation for TransformationFunction was flt. Since TransformationFunction works in all dimensions, this appears here as fltMD[p,A]. Note that neither the curve we are working with nor the variables matter, so we need to know only the point p and the transformation matrix A, which needs to be neither square nor invertible. However, in the affine case the number of columns needs to be 1 more than the length of p, and the length of the output will be one less than the number of rows. That is, an (n+1)×(m+1) transformation matrix takes a point in ℝᵐ to a point in ℝⁿ.
If p is an infinite point, then TransformationFunction should be replaced by either matrix multiplication A.p or fltiMD[p,A]. Then an (n+1)×(m+1) transformation matrix takes projective m-space to projective n-space. Actually fltiMD[p,A] will accept either an affine point of length m or an infinite point of length m+1, and if the result is not an infinite point it will be represented as an affine point of length n.
However an important observation was that invertible Transformation Matrices actually take curves to curves on the equation level. In the plane case this was simple as each curve is given by a single bivariate polynomial. This plane case is represented here by FLT2D[f,A,x,y]. This is easily extended to the naive case and given by FLT3D[F,A,X].
Although we will keep the name FLT3D to distinguish this version from the much more complicated general FLTMD, the main workhorse and contribution of this book, we note that FLT3D actually works in all dimensions and for systems of any number of equations as long as the transformation matrix is square and invertible. Unlike the more general FLTMD this works separately on each equation so the number of equations returned is the same as the number entered.
The Wolfram Language has many transformation matrices; see, for example, the examples under Transformation Matrices in nD in the help page Geometric Transforms. In addition see the symbolic transformation functions, for example
In[]:=
TranslationTransform[{3,-3,2}]
Out[]=
TransformationFunction[{{1,0,0,3},{0,1,0,-3},{0,0,1,2},{0,0,0,1}}]
so to translate the curve given by
In[]:=
F={z-x^2-y^2,x+y+z};
use
In[]:=
FLT3D[F,{{1,0,0,3},{0,1,0,-3},{0,0,1,2},{0,0,0,1}},{x,y,z}]
Out[]=
{-20+6 x-x^2-6 y-y^2+z,-2+x+y+z}
Wolfram also has rotations, reflections and scaling (homothety) transforms in n dimensions.
In addition we import klRotation2D and ip2z2D from our plane curve book (code in GlobalFunctions.nb), but note that the latter does not need the dummy variables x, y entered; the syntax is simply ip2z2D[ip] where ip is the infinite point.
For 3 dimensions we have a generalization of klRotation2D, uvRotationMD[u,v], which takes the vector u and rotates it about the origin until it is in the direction of v, and a transformation matrix ip2z3D[ip] which takes the infinite point ip and places it at the origin.
The standard example of a curve requiring more than n−1 equations is the twisted cubic.
In[]:=
twCubic={x z-y^2,y-x^2,z-x y};
The claim is that no two of these equations describe this curve; all 3 are needed. In fact the naive curve defined by any two contains a line in addition to the curve. Later, in section 3.2, we will learn how to analyze these curves defined by two quadrics, known as QSIC. For now we use a trick. In ℝ³ a line and a plane need not intersect, but usually will; the exception is when the line is parallel to the plane. So if the line is given, a random choice of plane will intersect that line with probability very near 1. Now if we intersect the twisted cubic defined by all three equations with a random plane we get 3 points, possibly 2 of them complex.
Thus it is enough to show that intersecting the QSIC defined by two of the three equations actually gives 4 intersection points, meaning the QSIC must have an extra component. This works fine in the first two cases
The reason is that the extra line is contained in the infinite plane! So we use the trick from Chapter 6 of my plane curve book: we bring most of the line back into the affine plane by ip2z3D[{0,0,1,0}]
So this set has infinitely many regular points but also infinitely many non-regular points; even though it is given by 2 equations in 3 unknowns, it is not a curve.
Example 2.2.3 Cyclic 4
Here is a well-studied curve in ℝ⁴, the Cyclic 4. We will examine this further later on in this chapter.
Thus this curve has 8 complex singular points. It can be shown that all solutions are of this form and only these 8 are singular so the cyclic 4 is a curve.
A strange fact, discussed later in this chapter, is that the parametric curves {r, 1/r, -r, -1/r}, {r, -1/r, -r, 1/r} comprising the solution set of C4 are non-singular as parametric curves. So singularity is based on the equation system rather than the geometry of the point set. We have actually seen this before with plane curves: x+y=0 is non-singular, but the same solution set is given by (x+y)²=0, where every point is singular.
One other general algorithm we can give at this point is a general critical point finder. It does not work as well as criticalPoint3D for naive curves, but it will return some critical points using standard optimization techniques. It does not give isolated points, but it may identify possibly singular points by repeated solutions. This method was suggested by the paper [Wang, Bican], but since the methods are not specifically related to this paper we just give the code. The objective function is random, so you might run this several times.
2.3.1 Construction of Macaulay and Sylvester Matrices.
As mentioned above these matrices can be large, therefore it is important to have efficient methods for constructing these. Fortunately Mathematica has adequate data manipulation methods which allow us to do that.
I generally use m to represent the order of a Macaulay or Sylvester matrix; this is the largest total degree of a monomial to be considered. The columns of both types represent the monomials in given variables up to total degree m. One can get a list of the monomials in the ordering used by the routine mExpsMD. For example, the columns of order 3 with 3 variables {x,y,z} correspond to the following list.
In[]:=
mExpsMD[3,{x,y,z}]
Out[]=
{1,x,y,z,x^2,x y,x z,y^2,y z,z^2,x^3,x^2 y,x^2 z,x y^2,x y z,x z^2,y^3,y^2 z,y z^2,z^3}
For Sylvester matrices the rows represent the coefficients, with respect to this monomial list, of the multivariate polynomials defining a curve, or other algebraic set, together with multiples of these polynomials by monomials of degree small enough that the product is of degree less than or equal to m.
For Macaulay matrices we apply a change of variables sending the given point to the origin and then allow multiplication by all monomials of order less than m, but then truncate the result by dropping all terms of total degree greater than m. If this truncation results in the zero row, we do not add this row to the Macaulay matrix.
Note that if m is smaller than the largest total degree of the equations, the Sylvester matrix would be empty or would ignore some equations, so our routine sylvesterMD will refuse to give a result, returning only an error message. On the other hand, macaulayMD will return a result for any m ≥ 1.
As you will notice in the applications, we generally use small orders for the Macaulay matrix but need larger orders for the Sylvester matrix. Although for the same m the Macaulay matrix will have far more rows than the Sylvester matrix, it is misleading to say Macaulay matrices are larger, since we use smaller orders for the Macaulay matrices than the degrees of the equations.
In the rest of this subsection I will explain the details of the construction; the reader uninterested in these may skip to the next subsection.
Rather than using the actual variables, since only coefficients appear in these matrices, we use integer lists corresponding to the exponents of the monomials. So, for example, if our variables are given as {x,y,z,w}, then instead of the monomial x²zw³ we will use {2,0,1,3}. Note that our variables are used in the order indicated in the last argument X. Here n is the number of variables.
Our first task is to create the list of possible exponents. We do this one degree at a time with a recursive routine, essentially getting homogeneous monomials hence the “h”.
Sylvester matrices will play a large role below. For those readers familiar with contemporary algebraic geometry, they essentially replace the concept of ideal. So we give only two simple applications here, which are multivariate extensions of procedures in Appendix 1 of our plane curve book.
2.3.2.1 Numerical Division of multivariate polynomials.
Given polynomials f, g of degrees d₁, d₂ in variables X, we note that we can use sylMD to express polynomial multiplication as matrix multiplication, with the formulas

sylMD[f*g, d₁+d₂, X] = sylMD[f, d₁, X] . sylMD[g, d₁+d₂, X]

h = f*g = sylMD[h, d₁+d₂, X] . mExpsMD[d₁+d₂, X]
Of course this will be about 100 times slower than the built-in Expand[f*g], but it gives us a suggestion for undoing this multiplication: recover f by multiplying sylMD[h, d₁+d₂, X] on the right by sylMD[g, d₁+d₂, X]⁻¹. Of course this will rarely be an invertible matrix, but we can use the pseudo-inverse instead. This gives us the procedure, using the faster FromCoefficientRules instead of multiplying by mExpsMD:
and as formulated there is no stopping criterion in this method to conclude that g is not a polynomial combination of the fᵢ. In the next section 2.4 we will see that there is a defect in the curve system {f₁, f₂}; we should have g and one more polynomial in our system, and then the first try will be definitive.
2.3.3 Applications of Macaulay Matrices
2.3.3.1 Intersection Multiplicity
The main application of Macaulay matrices is Macaulay’s original application, the computation of intersection multiplicity. For this application we will have a system of n or more equations in n variables, n ≥ 2, and an isolated solution. In our case isolated means a solution point p such that there is no other solution point q with Norm[p-q] < ϵ, for some ϵ > 0. In particular, p will not be a regular point of a curve.
A full description of this algorithm is given in [Dayton-Li-Zeng], where we emphasize that multiplicity is not just a number. This was known, but not well known, previously. We describe this concept with a sequence called, historically, the Hilbert Function, although the reader is forewarned that there are other, different sequences with this name in the literature, and even in this book, where in addition to this local Hilbert Function there is a global Hilbert Function. Essentially this measures the change in dimension of the null space of the Macaulay matrix of order m as m increases. The first term (m=0) of the Hilbert Function should be 1, indicating that the point p is a zero of our system. The second term (m=1) we call the breadth, which is also known as the embedding dimension by algebraic geometers; it is always at most n. The fact that p is isolated implies that the numbers in this Hilbert Function become zero at some point; once this happens it will continue to happen if we calculate further. The order of the last non-zero number in the Hilbert Function we call the depth, which should not be confused with Macaulay’s notion of depth. Finally, the sum of all non-zero numbers in the Hilbert Function is simply called the intersection multiplicity, or just multiplicity.
As mentioned above in section 2.3.1, the Macaulay matrix can be very large already when m or n is greater than 3. Thus calculating null spaces can be time consuming. Therefore I give several algorithms for multiplicity.
The first is our original, which requires the user to guess an upper bound for the depth. It is the only version that gives the Hilbert Function. Usually this bound is a small number, so the calculation will be quick. If the depth turns out to be large this version stops before termination, so as not to force the user to wait. On the other hand, this version does not halt at the first occurrence of 0 in the Hilbert Function. In order to make this as fast as possible we calculate only the final Macaulay matrix and deduce the Hilbert Function from that.
We need two subroutines. The first, nrref, is simply a numerical version of the reduced row echelon form; the code, in GlobalFunctions.nb, will be discussed in the next subsection. This nrref does return a sequence which allows computation of the Hilbert Function. Here is that algorithm.
Here X is the set of variables and tol is a desired tolerance. For numerical systems, in particular, this can make a difference; for example, a very loose tolerance can pick up nearby isolated points. But the reader should be aware that, for high depth, computing intersection points accurately is a problem; see our paper [Dayton-Li-Zeng]. A loose tolerance can make up for an inaccurately calculated intersection point. Note the built-in function Timing returns the time of execution and the value. If the last entry of the Hilbert Function is not 0, a warning message is given.
Example 2.3.3.1.1: Consider the two variable system at {0,0}
In[]:=
F={x^2-y^2+x^3,x^2-y^2+y^3};
In[]:=
Timing[multiplicity0MD[F,6,{0,0},{x,y},dTol]]
»
HilbertFunction {1,2,2,1,1,0,0}
»
Depth 4
Out[]=
{0.038308,7}
Our second version recalculates the Macaulay matrix at each step, but it stops at the first 0 in the Hilbert Function; so, since low multiplicities are the most common, it is usually the fastest, although it could run a long time if the depth is large. The user does not need to give an upper bound for the depth. The subroutines are not necessary.
The final version assumes the maximal depth will be less than the sum of the total degrees of the equations. This seems to be valid, although the author has no proof. The advantage is that the code is short, but the answer is not guaranteed unless the multiplicity is less than the sum of total degrees.
In this case we see multiplicityMD is the fastest, but if we tried it with dTol, perhaps getting the correct answer was luck.
2.3.3.2 Tangent Vectors
Our function tangentVectorJMD works in simple cases but may not work in near singular cases. An alternate uses the local property of the Macaulay matrix and gives some information about singular points encountered. Unlike the multiplicity finders above we do not expect to apply this to an isolated point so we will use a global Hilbert function rather than the local one used above. These Hilbert functions are related in some sense as integrals or derivatives of each other. The discrete built-in functions Accumulate and Differences will connect these two Hilbert functions.
Our function nrref mentioned above is very important here so we give the code.
Increasing the order of the Macaulay matrix gives more of the Hilbert Function. Since this is the accumulation of the local Hilbert Function, it stabilizes at the multiplicity. In this example we had a regular point, so the multiplicity is 1. We could use this to calculate 2-dimensional singularities; note the first argument of tangentVectorMD is a set, so even with one equation we need set braces.
we get a tangent vector but it has multiplicity 0.
Going back to Example 2.2.3, the Cyclic 4 curve
In[]:=
C4={w+x+y+z,w x+x y+y z+z w,w x y+x y z+y z w+z w x,w x y z-1};
In[]:=
tangentVectorMD[C4,{1,-1,-1,1},{w,x,y,z}]
»
HilbertFunction {1,2,1,1,1}
»
No unique tangent vector at {1,-1,-1,1}
we have a singular point of multiplicity 1. We will explain later.
2.4 H-bases
We saw in Example 2.3.2.2.1 that there is a lack of a stopping point in the membership problem, but it was suggested that an equation system could be modified so that only one step is needed. The main thrust of this section is to describe a type of equation system for which this is true.
However there are infinitely many equations that any given curve, or more generally algebraic set, will satisfy. Several algorithms we will see generate a large number of these and we want to pick a good, but relatively small, equation set. The equation sets that satisfy the previous paragraph are good candidates for this.
Fortunately Macaulay, in the same 1916 book where he described his Macaulay matrix, did come up with an answer. He was using homogeneous equations for projective space and so called this an H-basis. Some authors use the name Macaulay basis for this. In this book, even though we recognize that algebraic curves live in projective space, we prefer working in affine space as it is more algorithm friendly. It turns out that H-bases work fine in affine space too.
A system of polynomial equations in n variables is an H-basis if the membership problem can always be solved in one step. Specifically, a system F = {f₁, f₂, …, fₖ} of polynomials in n variables is an H-basis if any n-variable polynomial g of degree d is a polynomial combination of the polynomials of F if and only if there exist polynomials {g₁, …, gₖ} so that

g₁f₁ + g₂f₂ + ⋯ + gₖfₖ = g, with each gᵢfᵢ of total degree ≤ d.
In particular, suppose k = 3, f₁ is linear, f₂ is quadratic and f₃ is cubic. If h is linear, then to be a polynomial combination of the H-basis F = {f₁, f₂, f₃}, h must be a constant times f₁. If h is quadratic it can be a linear polynomial times f₁ plus a constant times f₂. If h is cubic it can be a quadratic times f₁ plus a linear polynomial times f₂ plus a constant times f₃. And so on.
H-bases do exist, and every polynomial system is a subset of an H-basis. We will see that Mathematica has a built-in algorithm, GroebnerBasis, to find one. This algorithm uses abstract algebra, so we will not try to explain here how it works. A simple introduction to Gröbner bases is given at the beginning of the book by [Cox, Little and O’Shea], but unfortunately I do not know of an elementary exposition of H-bases that does not require lots of algebra. My position in this book has always been that any algorithm of Mathematica does not require my explanation. The big problem using GroebnerBasis is that this algorithm is intended for integer systems only. Mathematica will try to handle numerical systems, but we can’t rely on this working.
If F is an H-basis, any larger system is also an H-basis. The trick is to find a small H-basis, and in the rest of this section I will concentrate on this.
2.4.1 The algorithm hBasisMD
This algorithm (revised 5/2020) takes a large polynomial system and attempts to find a small H-basis. It will not give an H-basis if the argument m is not large enough; unfortunately one cannot know in advance what m is large enough. One can check with hBasisMDQ below. The big advantage of this version of hBasisMD is that it works fine with numerical systems, which will occur in applications.
There are essentially three steps in this algorithm. The first step is to calculate the Sylvester matrix for the user-given m and use the singular value decomposition to find a full rank row space. For numerical systems this essentially replaces the possibly numerically inconsistent input system with a least squares approximation of a consistent system. We then apply a reverse row reduction to find polynomials of small degree among polynomial combinations of the now consistent input system. This is the essential reason for H-bases. These first two steps are contained in the more general matrix reduction procedure arref below. The final, third, step is to turn the resulting Sylvester matrix back into a polynomial system and use the membership problem solution to reject polynomials which are already polynomial combinations, inside the vector space of polynomials of degree m or less, of preceding accepted polynomials. The output is what is left after the rejections.
The idea is that arref will allow us to pick out polynomial combinations of our input of lowest degrees. We look at a previous example:
Example 2.4.1.1. We consider Example 2.3.2.2.1, where we found a linear polynomial that was a polynomial combination of a fourth and a fifth degree polynomial.
In[]:=
f1=x+y-2 z+y z^2-z^4;f2=-x^2+y-x y+2 x z-z^2-x y z^2+x z^4;
We start by picking m=6 and calculating the Sylvester matrix and its arref decomposition.
So we have produced our linear polynomial, and we can hope that m=7 is large enough, that is, that from these 30 polynomial combinations we can get all polynomial combinations of {f1,f2} without relying on cancellation of terms to do our work.
So the algorithm hBasisMD creates a list of polynomial combinations, using arref, that we hope is a building block for all polynomial combinations of our input system. The first entry of this list becomes an element of our proposed H-basis H, and we proceed down the list, using our membership problem method to test whether each entry is a polynomial combination of the previous choices. If not, we add it to our list H. So we hypothesize that every polynomial combination of our input system is an appropriate combination of polynomials in the list, which in turn are appropriate combinations of our list H.
Although this code is fairly simple, we are finding the rank of increasingly large matrices. This can take a long time. A new feature (5/2020) is that if A has many rows then the procedure gives temporary output of progress. This gives the user a chance to abort the procedure if the user does not wish to wait. Unlike previous versions, there are no options available. A global Hilbert function of the original system and of the H-basis are given; ideally the second Hilbert function will stabilize. If not, you may wish to try a larger m, or to use the algorithm hBasisMDQ below to test the output of this algorithm.
Example 2.4.1.1 Continued. We use the algorithm to calculate an H-basis for {f1,f2}.
In[]:=
Timing[hBasisMD[{f1,f2},7,{x,y,z},dTol]]
»
InitialHilbertFunction{1,2,5,7,9,18,22,26}
»
FinalHilbertFunction{1,2,2,2,2,2,2,2}
Out[]=
{44.5493,{-0.5x-0.5y+1.z,-1.y+1.z^2}}
2.4.2 hBasisMDQ
Typically hBasisMD is used as a subroutine for other algorithms which return a large set of polynomials, in order to get a smaller set, not necessarily an actual H-basis. It will terminate with a warning message if some of the input polynomials have degree greater than m. The one thing that should always happen is that the returned set H will at least generate the input set, so even if one does not get an H-basis something useful is returned.
But one will not get an H-basis if too small an m is used. There is no easy a priori method to guess a large enough m, but the algorithm in this subsection should be able to check whether you do have an H-basis.
So you can use the output from hBasisMD in hBasisMDQ. If using hBasisMD as a stand-alone procedure you may wish to run hBasisMDQ first to see if you already have an H-basis and to get a value of m that should work.
hBasisMDQ works by comparing the input system with a known H-basis. By default this is the output of the built-in Mathematica function GroebnerBasis with option MonomialOrder→DegreeLexicographic. As mentioned before, this is not cheating the reader, as I have never promised to explain built-in functions, only my own. Normally Gröbner bases only work for systems with integer coefficients; Mathematica's will attempt numerical systems, but I offer no guarantees. In particular, GroebnerBasis will flag inconsistent systems and abort: an over-determined numerical system that may be fine in other places in this book may look inconsistent to GroebnerBasis.
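For instance, on exact integer input such as the twisted cubic, the default comparison basis would be computed by something like the following (the particular system is just an illustration):

```mathematica
(* exact coefficients, so GroebnerBasis is on safe ground here *)
GroebnerBasis[{y - x^2, z - x^3}, {x, y, z},
  MonomialOrder -> DegreeLexicographic]
```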
The syntax is hBasisMDQ[F, H, X, tol], where F is your known system and H is the system you wish to check to see whether it is an H-basis. As usual X is the variable set and tol is the desired tolerance. Note that m is not used as input, so one does not need to know m beforehand. Here, especially, the order of the variables matters: Lexicographic order is with respect to the order in X. For instance, the built-in MonomialList used in GroebnerBasis returns a different list depending on the way the variables are listed:
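For example (the polynomial here is chosen only to illustrate the effect of the variable order on ties in total degree):

```mathematica
p = x^2 y + y^2 z + z^3;
MonomialList[p, {x, y, z}, "DegreeLexicographic"]  (* {x^2 y, y^2 z, z^3} *)
MonomialList[p, {z, y, x}, "DegreeLexicographic"]  (* {z^3, y^2 z, x^2 y} *)
```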
As an option hBasisMDQ will treat the first argument F as a known H-basis and check the argument H against that. This could be useful, for example, if one has an H-basis but is concerned that it is not minimal. This may be, for instance, the case for the H-basis returned by GroebnerBasis.
Our function hBasisMDQ gives an information notice with the size of the Gröbner basis and a list of the total degrees of polynomials present. In the case above, where the option useF→True, this information refers to F rather than the Gröbner basis, which is not calculated. If it is determined that H is an H-basis then the procedure returns only the value True. Otherwise it stops at the first instance where an element of F is not expressible in terms of the polynomials in H. If the Gröbner basis (or optionally F) has polynomials of degree 1 but H does not, then hBasisMDQ flags that fact and stops, doing no calculations. Otherwise it gives the degree of the missing polynomial, and which polynomial of the Gröbner basis in that degree it is, and halts. For the user's convenience this routine creates a global variable lastHBGroebner so this polynomial can be retrieved.
Using the last sentence above, the user could run hBasisMDQ several times to find a minimal H-basis from the Gröbner basis, but if the Gröbner basis is large it is probably better to use hBasisMD with the m given by the largest degree in the Gröbner basis.
The procedure hBasisMDQ works by running the membership test above on each member of the Gröbner basis, or optionally the H-basis F. Here is the code
Gröbner bases may be large, so this routine could take a long time to run, especially if H is an H-basis. But since it stops at the first omission, this version does not give running information. Again, this could give a false negative in the numerical case, but a return of True should be reliable.
2.5 Duality, Union, Intersection and Decomposition of Curves
Already in 1916 Macaulay talked about the dual to his Macaulay matrix. Duality will play a small but important technical role in our considerations.
2.5.1 Duality
Given a matrix M, generally a Macaulay or Sylvester matrix, a dual matrix is a matrix D with independent columns with the property that M.D = 0, where 0 represents the zero matrix of the appropriate size. Essentially a dual matrix of M is just a matrix whose columns give a basis for the null space of M.
We will assume our matrix has numerical entries, so instead of using the built-in NullSpace procedure we choose a tolerance and use the following; see for example Appendix 1 of my Plane Curve book.
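The construction amounts to a tolerance-based numerical null space. Here is a minimal sketch using the singular value decomposition; the book's dualMatrix may differ in details such as normalization:

```mathematica
(* columns of the result span the numerical null space of M, so M.D is
   approximately the zero matrix; singular values below tol are treated as 0 *)
dualMatrixSketch[M_?MatrixQ, tol_] := Module[{u, s, v, rank},
  {u, s, v} = SingularValueDecomposition[N[M]];
  rank = Count[Diagonal[s], sv_ /; sv > tol];
  v[[All, rank + 1 ;;]]]
```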
In this case the difference is that dualMatrix gives its answer as columns rather than giving the null space basis as rows. If we had used an integer matrix then we would have:
Note that this is properly a row matrix, that is the rows form a basis for the left null space.
Traditionally the dual of the Macaulay matrix was considered to be a space of differentials describing the local structure. The left (or local) dual of this should recover our original. We will typically be interested here in the dual of the Sylvester Matrix with the left dual of that recovering our curve.
The union of two space curves is more difficult. For plane curves we simply multiplied the equations, but in space we have several equations for each curve. The trick is to pass to duals: duality takes unions to intersections and vice versa. So we take the dual matrices of appropriate Sylvester matrices, with the same m, and join these; then we take the local dual of the combined dual matrix. The question is how big to make the matrices. Here the idea of H-bases helps: we make sure each Sylvester matrix is large enough to contain an H-basis, and at the end we give the result as an H-basis.
Example 2.5.3.1: We will take the union of a line and the twisted cubic for a relatively easy but non-trivial example starting from H-bases (recommended).
Even though the twisted cubic is not a naive curve, the union is; in fact this is a quadric surface intersection curve (QSIC), see Section 3.2. An unintended feature of my hBasisMD function is that even though the line was given by numeric equations the end result is integer! One could have exploited that immediately at the input level:
In[]:=
hBasisMD[ln,2,{x,y,z},dTol]
»
InitialHilbertFunction{1,1,1}
»
FinalHilbertFunction{1,1,1}
Out[]=
{-2.-1.x+1.y,-2.-3.x+1.z}
We will look at this example again. For now note that we could also calculate the intersection.
In[]:=
NSolve[Join[twc,ln]]
Out[]=
{}
But this is actually wrong; Mathematica does not like numerical systems of 5 equations in 3 unknowns! Using exact representations:
In[]:=
{x,y,z}/.NSolve[Join[twc,{-2-x+y,-2-3x+z}]]
Out[]=
{{2.,4.,8.},{-1.,1.,-1.}}
We might expect a third point since we are intersecting a cubic and a line, but it is a well known fact that no 3 points on the twisted cubic are collinear [see Harris].
Note it is easy to plot this curve since both components are parametric:
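For instance, parametrizing the twisted cubic as (t, t², t³) and the line, from the hBasisMD output above (y = x + 2, z = 3x + 2), as (t, t + 2, 3t + 2):

```mathematica
(* the twisted cubic and the line plotted together; note they meet at
   the two intersection points {2,4,8} and {-1,1,-1} found above *)
ParametricPlot3D[{{t, t^2, t^3}, {t, t + 2, 3 t + 2}},
  {t, -2, 2.5}, PlotStyle -> {Red, Blue}]
```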
So this is the intersection of 3 quadric surfaces. In Section 3.2 below we study the classification of curves given as the intersection of 2 quadric surfaces, QSIC, and although there are examples with three lines, this shows that not all unions of 3 lines in ℝ³ are QSIC.
2.5.3.3 Here is one more example relevant to Section 3.2
Since all the pieces can be parameterized it is easy to plot. Again this looks similar to a QSIC but is not a QSIC [see C. Tu, W. Wang, B. Mourrain, J. Wang, case numbers 23-26].
2.5.4 Decomposition of reducible curves.
Unlike the plane case, where the single equation of a reducible curve factors, possibly with irrational or complex coefficients, the equation system for a reducible space curve (see our examples in the previous section) does not factor. For Example 2.5.3.1 the two equations give smooth quadric surfaces and thus are not factorable.
Without fully describing a space curve, the only sure way to test for irreducibility is to use one of the higher-powered solvers such as PHCpack [PHCpack] or Bertini [Bates, Hauenstein, Sommese]. I give my solution to this problem below, but it may require plotting the curve first using methods later in the book.
2.5.4.1 Dual Interpolation
We saw in Section 2.5.3 that duality takes unions to intersections, that is the duals of components can have separate rows in the dual matrix. We exploit this by considering the dual matrix of the curve and attempting to find equations describing a given component. But first we need a technical subroutine.
To try to explain: in principle the Sylvester matrix of high enough order contains all the information necessary to determine the curve. One property of a curve is the Macaulay information at a point. One could recover the equations, perhaps using duality and hBasisMD, and take the Taylor series at that point, which can be used to do a hand calculation of the Macaulay matrix. Or one can figure out how this works within the dual matrix. At one point your author did this in general, but don't ever ask him to show his work; the answer is encoded in this Mathematica procedure.
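For orientation, the local data in question are the Taylor coefficients of each equation centered at the point. For a polynomial f they can be read off exactly by recentering and collecting coefficient rules; the name below is hypothetical, not the book's procedure:

```mathematica
(* Taylor data of f at p through total order k, as exponent -> coefficient
   rules; for a polynomial the recentered expansion is exact *)
taylorData[f_, vars_, p_, k_] := Select[
  CoefficientRules[Expand[f /. Thread[vars -> vars + p]], vars],
  Total[First[#]] <= k &]
```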
The following example gives some idea how this might work.
Example 2.5.4.1.1 See Example 2.5.3.1 the union of a line and twisted cubic.
In[]:=
F={-x^2+y+xy-z,2x^2-2y+y^2-xz};p={1,1,1};
I start with the answer, the Macaulay matrix at this point. Since this is a regular point the interesting part of this is the first two rows. Since the two equations become separated we look at the equivalent nrref form.
We don't quite get the original system, but the surprise is that the first 3 equations define the twisted cubic, not the union F which was the only input data. This is because we started with a Macaulay matrix, which gives only local information at the point p = {1,1,1} and doesn't see the line. We would get better results if we used additional points on the twisted cubic and/or higher order. So this will give us our algorithm for finding equations of irreducible components of reducible curves.
We come to our most important procedure in this book. We have already introduced Mathematica's TransformationFunction, otherwise known as a projective linear transformation or linear fractional transformation. As the reader is well aware, your author prefers the name fractional linear transformation, FLT. These transformations can have domain and range of any dimensions and are given by transformation matrices, which are (n+1)×(k+1) matrices where the transformation goes from ℝᵏ ⟶ ℝⁿ. The entries could be complex as well. Thus an example ℝ⁴ ⟶ ℝ² could be
Example 2.6.0.1
In[]:=
A=RandomInteger[{-9,9},{3,5}];
In[]:=
A={{7,8,-4,6,-8},{9,-2,-6,0,-2},{9,6,-3,7,2}};
In[]:=
A//MatrixForm
Out[]//MatrixForm=
7    8   -4   6   -8
9   -2   -6   0   -2
9    6   -3   7    2
In[]:=
TransformationFunction[A][{w,x,y,z}]
Out[]=
{(-8+7w+8x-4y+6z)/(2+9w+6x-3y+7z), (-2+9w-2x-6y)/(2+9w+6x-3y+7z)}
I also have alternate notation
In[]:=
fltMD[{w,x,y,z},A]
Out[]=
{(-8+7w+8x-4y+6z)/(2+9w+6x-3y+7z), (-2+9w-2x-6y)/(2+9w+6x-3y+7z)}
In my Plane Curve Book and Chapter 1 of this book I restrict to invertible square transformation functions and give corresponding functions FLT2D, FLT3D which take equations to equations. This makes these much more useful. Actually FLT3D will work for any dimension n as long as the transformation matrix is invertible. These work equation by equation, simply composing each equation with the inverse transformation.
In the general case, however, we don't have an inverse transformation, and the number of equations in the range may be more or fewer than the number in the domain. In the example above a curve in ℝ⁴ would have 3 or more equations but a curve in ℝ² has only one. Thus many of the techniques we have introduced in this chapter, in particular Sylvester matrices, duality and H-bases, will be used.
The key, as with FLT2D and FLT3D, is that the transformation of equations works naturally in the opposite direction from the transformation of points. But duality turns this around: the transform of dual spaces works in the same direction as the transformation of points. The other thing we will have to deal with is the fact that these transformations are actually transformations of projective space, so we will need to work with homogeneous polynomials. Then these FLT will be simply linear transformations rather than rational functions. We will need the following simple subroutines:
So we take the Sylvester matrix of our domain system, dualize, map the duals with a linear version gmapMD of our transformation, and return with localDualMatrix, getting a large system which we reduce using hBasisMD. Because this may be time consuming we add some options to help the user keep track of what is going on. There are also some warning messages included, all making the code somewhat longer than usual in this book.
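Schematically the pipeline reads as follows. The helper names and signatures here are guesses based on the description above, not the book's actual code, which also adds options, warnings and progress reports:

```mathematica
(* schematic only: transform the system F in variables X by the matrix A,
   returning an H-basis in the range variables Y *)
fltSysSketch[F_, A_, m_, X_, Y_, tol_] := Module[{S, D, DY, G},
  S  = sylvesterMatrix[F, m, X];   (* degree-<=m combinations of F *)
  D  = dualMatrix[S, tol];         (* dualize the domain system *)
  DY = gmapMD[D, A, m, X, Y];      (* duals map in the same direction as points *)
  G  = localDualMatrix[DY, m, Y];  (* back to equations in the range variables *)
  hBasisMD[G, m, Y, tol]]          (* reduce the large system to an H-basis *)
```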
Here A is the transformation matrix, and X, Y are the variables for the domain and range respectively. m will be the order of the Sylvester matrix used, so it must be at least the largest total degree of a polynomial in F, but it often needs to be larger. Especially when dealing with numerical data the tolerance may need to be loosened. Since most interesting FLT are numerical, this is one good reason why I have been working numerically. It does help if F is an H-basis.
Because of these choices this is somewhat of a trial and error algorithm; it probably works in good cases but is not guaranteed. It is therefore a good idea to check the results. The important property is that the output
As mentioned above, a transformation matrix for a transformation ℝⁿ ⟶ ℝᵏ is a (k+1)×(n+1) matrix A.
I will mention here that since transformation functions are essentially projective transformations, the matrix is homogeneous: if one multiplies all entries by the same non-zero real (or complex) number the transformation remains the same.
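For example, using the matrix from Example 2.6.0.1 (the factor 5 is arbitrary):

```mathematica
A = {{7, 8, -4, 6, -8}, {9, -2, -6, 0, -2}, {9, 6, -3, 7, 2}};
(* scaling the whole matrix does not change the transformation *)
Simplify[TransformationFunction[A][{w, x, y, z}] ==
  TransformationFunction[5 A][{w, x, y, z}]]
(* True *)
```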
If A is square, that is k = n, and A⁻¹ exists, then the transformation is invertible. Geometrically this means that if FLTMD takes curve F to curve G then these curves are isomorphic, that is, geometrically the same. G may be rotated, reflected, translated, or the infinite hyperplane may have been moved, or possibly all of the above. Some positional attributes may have changed, such as critical points, infinite points or the number of affine topological components. But geometric attributes such as the number of ovals or pseudo-lines, algebraic irreducibility, and the number and characteristics of singular points remain unchanged. For invertible transformation functions one may use FLT3D instead of FLTMD even if n is not 3. This will be much quicker and the number of equations will not change.
If the last row is {0,0,…,0,1}, or by homogeneity if the last entry is some other non-zero number and the rest of the row is zero, then we call this transformation function and its matrix affine. This means the infinite hyperplane remains in place and we are just messing with the affine geometry. While critical points may change, the curves have the same infinite points and the same number of topological components; this latter fact is the original meaning of the word affine. The formula fltMD[X,A] will then be a list of polynomials rather than rational functions. For example
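Here is an illustration with a matrix chosen for this purpose (the matrix is an assumption of this sketch, not from the book); since the last row is {0,0,0,1} the denominator is 1 and the result is polynomial:

```mathematica
A = {{1, 0, 2, 5}, {0, 1, -1, 0}, {0, 0, 0, 1}};
TransformationFunction[A][{x, y, z}]
(* {5 + x + 2 z, y - z} *)
```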
If in addition the last column is {0,…,0,1} transposed, so there is no translation part, then the transformation function is a linear transformation. In this case we may strip A by removing the last row and column to get the k×n matrix Drop[A,-1,-1]:
In[]:=
A={{1,2,3,0},{5,6,7,0},{0,0,0,1}};
Drop[A,-1,-1]
Out[]=
{{1,2,3},{5,6,7}}
Then we can actually perform the transformation just by matrix multiplication
In[]:=
fltMD[{x,y,z},A]
Drop[A,-1,-1].{x,y,z}
Out[]=
{x+2y+3z,5x+6y+7z}
Out[]=
{x+2y+3z,5x+6y+7z}
The process of stripping is reversible, that is, a linear transformation ℝⁿ ⟶ ℝᵏ given by a k×n matrix B will give a transformation matrix in the sense of this section by, for example:
In[]:=
B={{1,2,3},{5,6,7}};
Append[Join[B,{{0},{0}},2],{0,0,0,1}]
Out[]=
{{1,2,3,0},{5,6,7,0},{0,0,0,1}}
In[]:=
B.{x,y,z}
fltMD[{x,y,z},{{1,2,3,0},{5,6,7,0},{0,0,0,1}}]
Out[]=
{x+2y+3z,5x+6y+7z}
Out[]=
{x+2y+3z,5x+6y+7z}
Finally, a linear transformation B is an orthogonal transformation if the rows, equivalently the columns, form an orthonormal set. In the real case, for a k×n matrix with k ≤ n this means B.Transpose[B] is the k×k identity matrix, while if k ≥ n then Transpose[B].B is the n×n identity. For complex matrices one uses ConjugateTranspose. Orthogonal transformations preserve Euclidean geometry, that is, lengths and angles are preserved, which does not necessarily happen with affine transformations in general. More importantly, operations with orthogonal transformations are more numerically stable, which is good since we often work with numerical transformation matrices. On the other hand, orthogonal matrices almost always have irrational entries, so numerical methods are preferred with them.
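For example, a random orthonormal-row matrix can be produced with the built-in Orthogonalize:

```mathematica
(* a random 2x3 matrix with orthonormal rows and its defining property *)
B = Orthogonalize[RandomReal[{-1, 1}, {2, 3}]];
Chop[B.Transpose[B]]  (* the 2x2 identity, up to rounding *)
```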
Two utility functions that may be useful are given below; they allow us to go between linear transformations and FLT transformations.
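In the spirit of the padding and stripping examples above, such utilities might look like this; the names are hypothetical sketches, not necessarily the book's definitions:

```mathematica
(* pad a k x n linear matrix with a zero column and the row {0,...,0,1}
   to get a transformation matrix in the sense of this section *)
toFLTMatrix[B_?MatrixQ] := Append[
  ArrayFlatten[{{B, ConstantArray[0, {Length[B], 1}]}}],
  Append[ConstantArray[0, Last[Dimensions[B]]], 1]]

(* the inverse operation: strip the last row and column *)
stripFLTMatrix[A_?MatrixQ] := Drop[A, -1, -1]

toFLTMatrix[{{1, 2, 3}, {5, 6, 7}}]
(* {{1,2,3,0},{5,6,7,0},{0,0,0,1}} *)
```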
Later we may start with an FLT projection, that is an FLT with fewer rows than columns. These appear more general in that infinite points may become affine. Yet they are not really more general, as it can be shown that projecting a curve with an arbitrary FLT projection is the same as transforming the curve with an invertible FLT and then doing a linear projection on the image. It is a bit hard to show this, so rather than give a proof we just give an algorithm to accomplish it, although in practice we will rarely use this.
Generally different random projections will be defined as above for each application. However, we could also define a random projection, with some constraints on the random numbers used, and use this projection many times. Such a projection is called pseudo-random. An example is our default pseudo-random projection: