From RepRap
Jump to: navigation, search

This page contains details of the "FiveD" integrator I developed for my Repic electronics board. The algorithm performs perfect 5D accelerated extrusion on a low-end 16-bit device (in my case a PIC micro-controller) using integer-only addition in the inner loop with zero accumulated error. The input is the XYZEF parameters from a GCodes G1 extrusion command, the output is the raw clocking signals to send to the stepper motors. The code assumes that the steppers are being driven by a fixed-rate interrupt routine, in the case of Repic this is set at 10,000 interrupts per second.

Mathematical Derivation

The math behind FiveD extrusion is the same as that for accelerated motion, where 'feed rate' is analogous to 'velocity'. From Newtonian physics we have the basic equation of accelerated motion:

1) <math>v = u + a * t</math>

where <math>v</math> = final velocity (end feedrate), <math>u</math> = initial velocity (start feedrate), <math>a</math> = acceleration, and <math>t</math> = time

By integrating with respect to t we can calculate total movement/extrusion distance over a given time period:

2) <math>s = u * t + 0.5 * a * t^2</math>

This is the equation we will use at run-time to determine how far the stepper motors should move. First, we must calculate 'a' which we can do if we know the total time for the entire extrusion. By rearranging 1) to solve for 'a' and substituting into 2) we arrive at equation 3) which allows us to calculate total extrude time 't' given start and end feed rates and the total distance 's', which is how extrusion is specified in gcode:

3) <math>t = 2 * s / (u + v)</math>

Note that conceptually this makes sense, it's simply distance divided by the average feed rate.

In the case of RepRap the feed rates are specified in mm/minute, so the calculated 't' value is converted from minutes to the interrupt tick-rate on the micro-controller (i.e. 10000/second in the case of Repic). This value is then rounded to the nearest integer and equation 1) is used to calculate 'a'. The rounding of 't' to the nearest integer often results in a small error being introduced to 'a' which in turn can cause the target feed rate value 'v' to be different at the end of the extrude. This error is well within the tolerences of RepRap though (typically < 0.01%) and the rounding ensures that the final iterated values of movement and extrusion equal the target 's' values exactly, which is more important.

To do the interpolation we convert to stepper values by multiplying distance by the step rate of the stepper motor in question (typically ~80 for the X and Y axis on Mendel). Each step requires two interrupts, one to set the driver signal to 1 and another to set it back to 0, so we double this value and then scale equation 2) by the result in order to calculate the number of step signals generated for any given t. We do not, however, need to fully evaluate equation 2) for each iteration in the main loop because s(t+1) can be calculated as a function of s(t):

<math>s(t)</math> <math>= u * t + 0.5 * a * t^2</math>
<math>s(t+1)</math> <math>= u * (t+1) + 0.5 * a * (t+1)^2</math>
<math>= u * t + u + 0.5 * a * (t^2 + 2 * t + 1)</math>
<math>= (u * t + 0.5 * a * t^2) + a * t + 0.5 * a + u</math>
<math>= s(t) + (a * t) + (0.5 * a + u)</math>

Algorithm Implementation

Since s(t+1) is a function of s(t) we can implement the algorithm iteratively with floating-point variables using addition only in the inner loop:

   ddNumSteps = a
   dNumSteps = 0.5 * a + u
   numSteps = 0
   for (i=0; i<t; i++, numSteps += dNumSteps, dNumSteps += ddNumSteps)
      ... use numSteps here...

In order to implement this code on a 16-bit PIC micro-controller the values are stored in 32-bit integer variables as 1:31 fixed-point values. Only 1 integer bit is needed for dNumSteps and ddNumSteps because RepRap cannot generate more than 1 step per interrupt tick, i.e. both values are always <= 1. We do lose all integer bits in numSteps apart from the LSB, but that's ok because only the LSB is needed to determine when the signal sent to the stepper driver should be toggled.

Unfortunately, even 31 bits of precision can cause enough loss of precision over the course of an extrude to result in the final value of numSteps being slightly off-target. The total amount of this accumulated error at any given moment can be calculated exactly and is equal to the fractional component of ddNumSteps lost after conversion to 1:31 multiplied by the number of iterations:

   ddNumStepsScaled = ddNumSteps * (1 << 31);      // scale up for 1:31 fixed-point
   ddNumStepsInt = (int)ddNumStepsScaled;          // and store in a 32-bit integer
   errorTerm = (ddNumStepsScaled - ddNumStepsInt); // the difference is the error for each iteration
   totalError = errorTerm * t;                     // total error for the entire extrude

We can adjust for this by calculating this total error term iteratively using Brensenham's line algorithm and adding the integer component of the result to numSteps during the inner loop in addition to dNumSteps. This results in perfect FiveD integrated extrusion with zero accumulated error over the ranges of distance, feed rate and acceleration typically used in RepRap.

Source Code

The following C# code calculates the parameters needed for the inner loop, in the case of Repic this code is executed on the main host and therefore uses double-precision arithmetic:

 // do 5D extrusion in XYE plane. For performance reasons I do Z separately
 // with a regular 4D iterator, but it could just as easily be done here.
 protected Extrude5D(
       Vector5D Start,                     // XYZEF at the start of the extrusion
       Vector5D End,                       // XYZEF at the end of the extrusion
       Vector5D stepRates)                 // number of ticks per mm for each XYZE stepper, F is interrupts per second (10000)
   // calculate distance in the XY plane i.e. |End.XY - Start.XY|
   // if distance is 0 then set it to extrude distance: (End.E - Start.E)
   double distance = CalcDist(Start, End);
   // solve for t
   double fi = Start.F;
   double fe = End.F;
   double t = 2 * distance / (fi + fe);
   // convert t from min to interrupts (10,000 per second) and round to nearest integer
   int ticks = (int)Math.Round(t * stepRates.F, MidpointRounding.AwayFromZero);
   // calculate a modified value of a to account for a whole number of interrupts
   double ascale = stepRates.F / ticks;
   double a = 2.0 * (distance * ascale - fi) * ascale;
   // convert deltas from mm to steps, multiply by 2 to account for turning steppers on and then off
   Vector5D delta = End - Start;
   delta.X = Math.Round(delta.X * stepRates.X * 2, MidpointRounding.AwayFromZero);
   delta.Y = Math.Round(delta.Y * stepRates.Y * 2, MidpointRounding.AwayFromZero);
   delta.E = Math.Round(delta.E * stepRates.E * 2, MidpointRounding.AwayFromZero);
   // calculate some values we'll need for conversion to fixed-point
   double fpscale = 1U << 31;
   double tscale = t / ticks;
   // calculate the iteration values for X    
   double ax = a * delta.X * tscale * tscale / distance,
          fix = fi * delta.X * tscale / distance,
          dx = (ax * 0.5 + fix) * fpscale,
          ddx = ax * fpscale;
   int    dxi = (int)Math.Floor(dx),
          ddxi = (int)Math.Floor(ddx),
          slopeX = (int)(ticks * (ddx - ddxi)),
          errorX = (int)((dx - dxi - 1) * ticks);
   // do the same for Y and E

The 32-bit integer values (dxi, ddxi, slopeX and errorX) are sent to the firmware which loops 'ticks' number of times and does the integration. Whenever the MSB of "x" goes positive we clear it and toggle the stepper pin; this is effectively the same as calculating Floor(x). However, it's mathematically more "correct" to calculate Round(x) and we can do this easily by initializing x to 0.5 i.e. 0x40000000 in 1:31 fixed-point arithmetic. The inner loop itself is as follows:

   if ((x & 0x80000000) != 0)
       x &= 0x7fffffff;
       // toggle stepper pin here
   x += dxi;
   dxi += ddxi;
   errorX += slopeX;
   if (errorX >= 0)
       errorX -= currentTicks;
   // repeat for Y and E

In general this will probably be the fastest code as the branch conditionals will more-often-than-not evaluate to false. However, if your target platform allows quick access to individual bits then using pre-calculated deltas might be faster:

   x_stepper_pin = x.bit31;
   x.val += dx;
   if (errorX.bit31)
       dx += ddx1, errorX.val += de1;
       dx += ddx2, errorX.val += de2;

Similarly, if your hardware doesn't like conditional branching but has good support for indexed memory access then it may be faster to store ddx1/de1/ddx2/de2 in 2-element-long look-up tables to avoid stalling the execution pipeline:

   x_stepper_pin = x.bit31;
   x.val += dx;
   dx += lut1[errorX.bit31];
   errorX.val += lut2[errorX.bit31];