
Reprap Module Communication Working Group

Posted by Annirak 
Reprap Module Communication Working Group
April 21, 2009 01:00PM
I don't know if this should go in software or electronics. I'm going to post it in electronics, but post a link to the discussion in the software forum.

As Reprap's level of sophistication goes up and people start to add self-contained modules to extend Reprap's core functionality, interoperability starts to become an issue.

What I want to do here is try to define a communication specification for the RS-485 bus, which will allow reprappers to reuse each other's modules without going through extensive firmware rewrites (which could compromise the function of the module). In essence, if we can define a communication specification, then we will be able to treat each module as a self-contained unit in both software and hardware.

I'm going to make a series of posts to this thread which lay out my ideas for a communication standard. I have worked with a few different communication specifications recently, including several generations of Ethernet, SCSI, and radio communication. I'm hoping that I can bring what I know of these highly defined specifications into Reprap in order to build a fairly simple but robust and extensible communication specification.

In this post, I'm going to lay out the background of the interoperability problem we face. In the following posts, I'll go through the 7 OSI communication layers, presenting several options for each, along with my recommendation for how to implement them.

The majority of reprappers who have been behind the RS-485 adoption have encouraged a half-duplex, single-master, shared-bus topology. Standardizing on half-duplex RS-485 is a good step towards interoperability, but it doesn't go far enough. If we leave the spec at "RS-485, half duplex," we leave out the majority of the communication challenge.

The problems we need to address next are essentially the 7 OSI communication layers. Working from the bottom up, they are: Physical, Data Link, Network, Transport, Session, Presentation, Application. We don't actually need to define all of these, but I will get into this more as I go.

To frame the problem more clearly, here's an example:
Suppose that you are building a reprap. You want to use a tool-head exchange system. Now, someone else has already developed a toolhead changer with an RS-485 interface, and made the source available for building it. You also want to use a dual-extruder print-head (filler and material) built by someone else. Now, while the two are mechanically compatible, their communication spec is completely different. The toolhead exchanger needs 9600 baud, while the extruder is designed (right down to how it times its synchronization) for 115200 baud. If that weren't enough, the extruder will automatically send information messages and status updates without checking if the bus is free. This causes all kinds of problems because the extruder's data tends to get mixed up in the toolhead exchanger's communication.

Fixing this system would require a major rewrite of the extruder's code, or a separate RS-485 bus. Both of these are counter-productive; it would be much better if they were both designed to interoperate from the start.

Preventing people from having to rewrite each other's code for the sake of communication interoperability is the goal I am striving for.

If you have any comments, post away!
RS-485, Physical Layer
April 21, 2009 01:52PM
Summary (more details below):
My recommendation is: a half-duplex RS-485 bus over CAT-5 with RJ-45 connectors, with some power delivered on the same cable. Each device would implement two RJ-45 connectors in a pass-through configuration. 9600 baud for negotiation, with communication up to 10Mbps once negotiated, depending on device capabilities. Parity will be replaced by LRC or CRC. Collision detection is implemented by bus readback and compare.

Details:

While it sounds like the physical layer is already decided, it's not all the way there.
The factors that are decided are:

-Electrical specification: RS-485
-Symbol encoding: none
-Duplexing: Half
-Topology: Bus
-Signal coding: NRZ


The remaining elements to decide are:
1) Connector specification
2) Wire specification
3) Baud rate
4) Parity
5) Collision detection
6) Termination characteristics*

Negotiation must be deferred to a much higher layer because available hardware does not support it.

Taking each of the remaining points in turn,
1) A large number of connectors are used for RS-485. Converting between different connectors is not an overly hard task at the speeds we are dealing with, so this portion of the spec can really be more of a recommendation than a requirement. While screw terminals are an effective, easy, and common solution, I think that modular connectors are actually a better solution. If we used RJ-45 connectors, they would provide 4 pairs which I would divide as follows: inner pair 1-data, inner pair 2-data2, outer pair 1- +12V, outer pair 2-GND.
Each device which implements this connector must implement an electrical passthrough, with a second RJ-45 connector to allow daisy-chaining boards. Star-splitter boards would be permitted.

2) Following from 1), CAT-5 is cheap and plentiful. It would allow us to provide signal and power pairs over the same connection. While the power would only be suitable for low power applications, it would provide more than enough for remote sensors to operate.

**I would not make 1) and 2) major requirements, just recommendations.

3) Baud rate should default to a low, universally supported baud rate, but allow negotiation (in a higher layer) to higher rates. The default rate would be 9600 baud. I would set the maximum rate to 10Mbps. Only capable devices would need to implement transceivers up to that signal rate. The recommended operating rate would be 1Mbps; due to timing and synchronization, faster signals are more useful to us, but 1Mbps is about the highest rate that most of the AVRs in use on Repraps support. 10Mbps transceivers are widely available and around $2 each in single unit quantities.

4)I'm not sure whether the reprap environment is noisy enough to warrant parity. I'm also not sure whether or not we would be better served by an LRC or CRC than parity.
[EDIT:] If you're going to do it, do it right. Since there are CRC libraries readily available, the protocol will use CRC on all frames.

5)Collision detection can practically only be implemented by each transmitter comparing the data they receive with the data they transmit. Higher layer detection cannot distinguish between transmission errors and collisions. For this reason, it would seem better to implement collision prevention than collision detection.
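The readback-and-compare idea can be sketched in C. The wire model below is a stand-in for real half-duplex RS-485 driver code (on hardware, the UART receiver hears the node's own transmission on the shared pair); the variable `wire_corrupt_at` is purely a test hook to simulate a collision:

```c
#include <stdint.h>

/* Simulated wire: normally the echo equals the byte sent; a collision
 * shows up as a mismatch because another driver fought the bus.
 * (On real hardware this would be: write to the UART, wait, read back.) */
uint8_t wire_corrupt_at = 0xFF;        /* index at which to fake a collision */

uint8_t bus_send_and_echo(uint8_t b, uint8_t index)
{
    return (index == wire_corrupt_at) ? (uint8_t)(b ^ 0x40) : b;
}

/* Transmit a buffer, aborting at the first byte whose echo mismatches.
 * Returns the number of bytes that went out cleanly. */
int bus_transmit(const uint8_t *buf, int len)
{
    for (int i = 0; i < len; i++) {
        if (bus_send_and_echo(buf[i], (uint8_t)i) != buf[i])
            return i;                  /* collision (or line error): stop */
    }
    return len;
}
```

With no corruption injected, `bus_transmit` returns the full length; inject a mismatch at any index and it aborts there, which is exactly the point at which a real node would back off and retry.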

[EDIT:]
6) (Thanks to jbb for this point) Termination will become important at high speeds, so a termination scheme is required if the bus will reach rates over 250kbps. The recommended terminator will be a dongle that plugs into an open RJ-45 port. The termination resistance is 100 ohms (matching CAT-5).

The bus controller MUST implement a termination scheme. This may simply be a RJ-45 jack, with a dongle, but if the terminator is not hard-wired, the documentation that goes with the host must clearly state that the terminator dongle is required.

The bus controller MUST implement a default bus bias scheme. The bias will be 100kOhm resistors, one pulling the D+ line up, one pulling the D- line down. This will ensure that the bus remains in the idle state when no data is being transmitted.

Edited 2 time(s). Last edit at 04/21/2009 06:27PM by Annirak.
jbb
Re: Reprap Module Communication Working Group
April 21, 2009 05:05PM
Hi Annirak

Defining a comms standard now is a good idea - it's best to get a standard in place before there is a large code base to modify. I've got 4 comments:

1) A few years ago I cooked up some RS232 based hardware for use on a production line. I found that per-character (hardware) parity was not effective and would let bad characters through, and a standard additive checksum was likewise not very good. I had to use a cyclic redundancy check (CRC) to get good robustness.

I realise that RS485 is much less sensitive to noise (which is why we're using it), but I still suggest that you abandon parity checks and just use a CRC at the end. The WinAVR library includes some fast CRC routines that work well.

2) You may also wish to specify termination characteristics. At low speeds it won't matter much but if you want to go all the way to 10 Mbps the lines must be properly terminated. There should be a spec for a termination plug which is plugged into each end of the bus. Additionally the bus master should possibly have some bias resistors so that the bus defaults to a known state under open circuit conditions.

3) Depending on the capabilities of the device UARTs, it may not be possible to achieve all desired bit rates. Your Network (I think) layer will need to understand what bit-rates each node can actually achieve, not just the maximum bit-rate.

4) Following (3), it seems that there should be some form of device descriptor that the host can request from each node at startup. At the least it should store the name, version and comms capabilities of the device.

Keep up the good work.


jbb
Re: Reprap Module Communication Working Group
April 21, 2009 06:16PM
Hi jbb,
You're absolutely right on all of those points. I didn't realise that there was fast CRC available in the WinAVR library.

Edited 1 time(s). Last edit at 04/21/2009 06:35PM by Annirak.
Re: Reprap Module Communication Working Group
April 22, 2009 08:04AM
Hi Annirak, JBB.

Definitely worth some consideration.
I saw Zach put a couple of RJ45s on his latest motherboard. I considered it, as I still had to finalize my PCB order from Hong Kong, but decided against it: I know, from the service angle, how frustrating it can be to have connections requiring specialist tools and not to have that tool. You can get RJ45 crimpers at a reasonable price, but for a hobby assembler it's more practical to make a lead of the correct length and connect it oneself than to have to go and buy a special lead and/or tool and connectors.

I'm using much lower pull up and pull down bias resistors on the RS485 lines, 1k. Remember they are only on the master device.

Personally, I would use parity AND a CRC.
If you really want to skip parity, then use at least two bytes for your CRC; over hours of transmitting messages under marginal conditions, a one-byte check is asking for trouble, especially without other tests.

Most messages are going to be short, so parity and a single CRC byte would be effective. It's hard to bugger up either one, without catching the error on the other test. Parity is low overhead, as virtually all USARTS come with automatic parity and parity tests too. Many have framing error detection in hardware too.

Baud rates:
I would stick with the common ones.
Baud rates are easiest to generate accurately from specific crystals, or by going to a lower baud rate, where the divisions allow more options with better accuracy. Most microcontrollers clock pretty fast these days, but you may also want to hook up a unit to a PC via an RS-485 converter, so you need a baud rate that's among the common Microsoft or Linux comms rates. Over a meg, communications are going to be very difficult anyway, especially if not using slew-rate-limited transceivers. I would propose a default of 38400 baud, as it's fast enough to be useful, slow enough to still be serviced by a busy microcontroller, and an easily generated clock rate for most systems. Someone might like to check on the reprap boards though; I'm not sure what crystals Zach specifies for them, or if there's an error rate on the division to 38400 baud. I generally select crystals that give me the most baud rate options when I'm designing an embedded product using async serial.

My STM32 ARM Cortex-M3 based motherboards are due to arrive on Monday :-)
Re: Reprap Module Communication Working Group
April 22, 2009 01:17PM
RJ-45 is quick to connect, easy to use, the connectors are cheap, and the connection is secure (unless you break the connector). In short, it has a large number of advantages that no other connector offers. The cost of the tool will be offset by the cost of connectors in many systems.

An RJ-45 crimp tool is less than $20. Actually, the "deluxe" tool is $15 at this store: [www.rpelectronics.com].
Honestly, even radio shack sells the tool for under $20.
[www.radioshack.com]
[edit:] These guys ship. Their crimp tool is over $20, but it's *really* nice; ratcheting and everything.
[ncix.com]

Tool cost is not a limiting factor.

CRC8 (not parity, not even close to parity) will be fine for packets that are 256 bits or less. CRC16 is necessary for packets of up to 65536 bits.

I thought 9600 would be a good negotiation baud rate. Negotiation is the process where each module identifies itself and its capabilities to the bus controller. When all the modules have negotiated, the bus controller will set the system baud rate to the maximum acceptable to all devices. So there is no "default" baud rate, there is a negotiation rate, which is intentionally low.

I do not want to limit the design specification based on the possibility that someone somewhere might some day want to connect a PC. If you want to connect a PC, then have it either be the bus master and dictate the maximum baud rate, or have it report itself as a device and report its maximum baud rate. This is why negotiation will be done at 9600 baud, which is the most commonly supported baud rate I know of.

But we're getting ahead of ourselves here. The next thing to cover is the data link layer. That will be coming some time today if I can work the kinks out.

Edited 2 time(s). Last edit at 04/23/2009 12:48PM by Annirak.
Reprap Data Link Layer
April 22, 2009 02:25PM
Data Link Layer
In this layer, we need to define
1) A frame format
2) A mechanism for either recovering from collisions, or preventing them from happening.
3) A bus arbitration scheme

Here are my thoughts:
1) The frame format is pretty straightforward: data frames use a 6-byte wrapper, while control frames use a fixed 8-byte format.

Device addresses are 1-byte. 0x00 is the bus controller address. 0xff is the general call address.

Data:[target address][frame type][function][length][length*data][CRC16]
Control:[target address][frame type][function][4*data][CRC8]
The control frames will allow most common operations without requesting a data frame.

All devices must support control frames. The bus controller must support data frames and control frames. Any device may optionally support data frames. This means that devices which do not support data frames do not need to implement CRC16.
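To make the control-frame layout concrete, here is a minimal sketch of building and checking one. The CRC-8 polynomial (0x07) is my assumption; the thread hasn't fixed one yet:

```c
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-8, polynomial 0x07, init 0x00 -- an assumed choice. */
uint8_t crc8(const uint8_t *p, int len)
{
    uint8_t crc = 0x00;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Control frame: [target][frame type][function][4*data][CRC8] = 8 bytes. */
void build_control_frame(uint8_t out[8], uint8_t target, uint8_t type,
                         uint8_t fn, const uint8_t data[4])
{
    out[0] = target;
    out[1] = type;
    out[2] = fn;
    memcpy(&out[3], data, 4);
    out[7] = crc8(out, 7);             /* CRC covers the first 7 bytes */
}

int control_frame_ok(const uint8_t f[8])
{
    return crc8(f, 7) == f[7];
}
```

A receiver recomputes the CRC over the first seven bytes and compares; any flipped bit anywhere in the frame makes `control_frame_ok` fail.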

2) For preventing collisions, I think it makes the most sense to use a time division multiplexing scheme, with the bus controller broadcasting a regular beacon.

The generalized structure of the communication is:
[beacon identifier][timestamp][data slot addresses][host control frames][device control frames][data slots][dead time]

[beacon identifier]
The beacon identifier needs to be unique in the communication. One way to do this is to deliberately insert a framing error, otherwise known as a break. Since framing errors can occur in normal communication, it makes sense to follow the framing error with a specific character to lower the chances of mistaking a framing error for a beacon. The beacon identifier should also not be mistakable for an accidental transmission at a lower baud rate. So I propose that the beacon identifier should be:
[break][0xAA]

[timestamp]
The timestamp needs to be fast enough to coordinate the movements of all the hardware in the system, and each beacon should have a new timestamp--that is, no two successive beacons should have the same timestamp--but the timestamp should recycle slowly enough that long movements by a print head do not need to use multiple timeouts.

A 1msec interval should be frequent enough for coordinating mechanical interactions. This makes the maximum wait period of a 16-bit timestamp 65.5 seconds. I don't think this is long enough for our purposes, which means that we need at least one more byte. 24-bits would work admirably well. I'm undecided on whether to use 24 bits or 32.
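The arithmetic behind those figures, for a 1 msec tick:

```c
#include <stdint.h>

/* Seconds until an n-bit millisecond counter wraps around. */
double wrap_seconds(int bits)
{
    return (double)(1ULL << bits) / 1000.0;
}
/* 16-bit: 65.536 s; 24-bit: 16777.216 s (about 4.7 hours);
 * 32-bit: 4294967.296 s (about 49.7 days). */
```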

[data slot addresses]
This essentially forms the bus arbitration scheme. In this section of the frame, the bus controller inserts one device address for each available data slot. Only the addressed device may transmit data in the corresponding data slot.

[NOTE:]Since this is effectively the last frame of the beacon, we need a CRC here. We could put the CRC later, but that would mean that host control frames and device control frames would have to be treated differently.
[CRC8] This CRC will cover only the timestamp and the data slot addresses.

[Host control frames]
The host is given special privileges and may transmit more than one control frame per beacon. I'm not sure how many more are necessary; that's something we still need to decide.

The control frames all take the format that I listed above.

[Device control frames]
Each device is given one control frame timeslot per beacon. The position of the timeslot is given to the device during negotiation. A 1-byte idle time is present between each device control frame to prevent over-runs.

[data slots]
The data slot length needs to be decided, so does the count of data slots per beacon.

[dead time]
I think there should be at least one byte of idle time before the beacon to prevent misinterpretation of a framing error as a beacon.


===============

The beacon overhead must be taken into consideration, so beacons need to be small. Ultimately, this works out to an ugly tradeoff. Either we have a high beacon frequency, where devices can transmit frequently, and control frames have very low latency, but data takes a long time to transmit because of a small data frame size, or we transmit a beacon less frequently, suffer increased latency, but lower overhead, and higher data rates.

Most devices should be able to support a 100kbps data rate; certainly any MCU with a 2MHz or greater crystal can do so. This will even allow PCs to communicate on the bus (115200). I think this is the ideal target for the communication spec.

Given our beacon timing format, here's what I propose: (you can paste this into a spreadsheet)

identifier,multiple,bytes,format
[beacon identifier],1,2,[BREAK][0xAA]
[timestamp],1,4,time since startup in msec
[data slot address],5,1,[device address]
[CRC8],1,1,[beacon checksum]
[host control frames],3,9,[target address][frame type][function][4*data][CRC8][idle byte]
[device control frames],32,9,[target address][frame type][function][4*data][CRC8][idle byte]
[idle byte],1,1,[idle]
[data slots],5,134,[target address][frame type][function][length][128*data][CRC16][idle byte]
Total: 1002 bytes
This gives: 5 data slots of 128 bytes each, 3 host control frames,and 32 device control frames (this makes the practical limit of the bus 32 devices + a host)

Using this format, a beacon interval of 10msec is achievable on a 100kbps bus, with a little idle time between beacons. Increasing the data rate will increase the data frames and the devices supported.

3) bus arbitration is mostly covered above. Practically speaking, the bus controller will grant data frames to devices which request them. Each device receives a timeslot for when it can transmit control frames during negotiation.

As always, let me know what you think!

Edited 2 time(s). Last edit at 04/23/2009 12:51PM by Annirak.
Re: Reprap Module Communication Working Group
April 22, 2009 06:18PM
One significant drawback of the bus spec I have listed is that every device which is not transmitting must receive transmitted data and parse for the target address; it cannot simply ignore the data. Because CRC checking can take significant time, this could be onerous for large data packets. A better approach might be to put separate CRCs on the header and the data. That way, the header of each data frame could be checked and the data ignored if the listening device is not the target.

This still leaves all devices listening and parsing all control frames.

While this may sound onerous, I should point out that devices are not *required* to transmit a control frame every beacon. They are simply *permitted* to. This does mean that all devices need to maintain a counter so that they can determine which timeslot a byte arrives in. This is less onerous, however, than requiring transmission in every timeslot.
Re: Reprap Module Communication Working Group
April 22, 2009 07:31PM
CRC is not onerous. For an 8-bit CRC (which is good for packets up to 256 bits, i.e. 32 bytes) it is only two XORs and a table lookup of 256 bytes.

A 16 bit CRC, which is what I would use, is just a wider table: 256 words.
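A sketch of the table approach (polynomial 0x07 assumed, since the thread hasn't fixed one): build the 256-entry table once, after which each input byte costs one XOR and one lookup:

```c
#include <stdint.h>

uint8_t crc8_table[256];

/* Precompute: crc8_table[i] = CRC-8 (poly 0x07) of the single byte i. */
void crc8_init(void)
{
    for (int i = 0; i < 256; i++) {
        uint8_t crc = (uint8_t)i;
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
        crc8_table[i] = crc;
    }
}

/* Per input byte: one XOR and one table lookup, as described above. */
uint8_t crc8(const uint8_t *p, int len)
{
    uint8_t crc = 0x00;
    while (len--)
        crc = crc8_table[crc ^ *p++];
    return crc;
}
```

Running the standard "123456789" check string through this gives 0xF4, the published check value for CRC-8 with polynomial 0x07. The 16-bit version has the same shape, just with a 256-entry table of words.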


[www.hydraraptor.blogspot.com]
jbb
Re: Reprap Module Communication Working Group
April 22, 2009 10:43PM
Hi guys

Nophead is correct: a well-written table-based CRC16 will take ~15 cycles per character on an AVR. If the CRC is updated as each character comes in by the UART receive interrupt then the latency drops right off. Calculating the CRC at the end would, it's true, be slow (tens of microseconds).

An additional point: if you elect to simply ignore any message with an error in it (and let the sender time out), it is possible to do an early bailout by checking the address byte. If a packet is addressed to (B), (A) can ignore it. If it _is_ addressed to (A) but the address is corrupted then (A) wouldn't have done anything anyway, and so can ignore it. If it was addressed to (B) but was corrupted to say (A) then, yes, (A) would waste time checking things before it noticed the CRC error, but that won't happen very often.
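The early bailout can be sketched as a per-byte receive handler. `MY_ADDRESS` is a hypothetical node address; the return value just counts how much CRC/buffer work would have been done:

```c
#include <stdint.h>

#define MY_ADDRESS 0x12   /* hypothetical node address */
#define BROADCAST  0xFF   /* general call address from the frame spec */

/* Feed one frame through a per-byte handler, the way a UART RX interrupt
 * would see it.  Returns the number of bytes on which CRC/buffer work was
 * done: 0 means we bailed out on the address byte and ignored the rest. */
int receive_frame(const uint8_t *frame, int len)
{
    int worked_on = 0;
    for (int i = 0; i < len; i++) {
        if (i == 0) {
            if (frame[0] != MY_ADDRESS && frame[0] != BROADCAST)
                return 0;             /* early bailout: not addressed to us */
            continue;
        }
        /* a real node would update its running CRC and buffer frame[i] here */
        worked_on++;
    }
    return worked_on;
}
```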

jbb
Re: Reprap Module Communication Working Group
April 23, 2009 03:27AM
jbb,
That makes plenty of sense.

Nophead,
I'm not so sure about the CRC length. An n-bit CRC will accurately detect any error burst of up to n bits.

Because a CRC can be modeled essentially as a division, the number of possible l-bit data blocks that will match a given n-bit CRC is 2^l/2^n = 2^(l-n)

The probability, assuming completely random data, of a false positive CRC match is 2^(l-n)/2^l = 2^(-n)

For CRC8, this is 1/256. Note that this does not depend on data length.

CRC is good at catching bursty errors. An n-bit CRC will catch up to n sequential error bits successfully.

Are we really expecting the bus to be so bad that we need 1/65536 probability and up to 16 bits of successive errors?
Re: Reprap Module Communication Working Group
April 23, 2009 04:27AM
From [en.wikipedia.org]
Quote

CRCs can be used to detect all double bit errors in the input message shorter than 2^k, where k is the length of the longest irreducible part of the polynomial.

That means that you lose the property of being able to detect all double bit errors when your number of bits is bigger than 2^k. That is why protocols with short packets use 8 bit CRC. Longer packets use 16 bit CRC and file checksums use 24 or 32 bit CRC.

It isn't an arbitrary choice; it has some mathematical basis.

When looking at the error detection rate you have to consider how many packets are sent during an object build and how many machines are operating. It soon becomes billions of packets. If the odd undetected error gets by then the machine will get a reputation for being unreliable since a small error can completely ruin a print.


[www.hydraraptor.blogspot.com]
Re: Reprap Module Communication Working Group
April 23, 2009 12:23PM
I had to do some more reading to see exactly where you were coming from. For the reference of anyone who is not familiar with the performance of CRC:

An N-bit CRC offers
* 100% detection of single-bit errors;
* 100% detection of all adjacent double-bit errors;
* 100% detection of any errors spanning up to N-bits;
* 100% detection of all two-bit errors not separated by an exact multiple of 2^N-1 bits
* For arbitrary multiple errors spanning more than N bits, at worst 1 in 2^N failures

On that basis, we only need CRC8 for control frames, but we do need CRC16 for data frames.
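The single-bit and adjacent double-bit guarantees are easy to confirm by brute force on a small frame. This sketch (CRC-8, polynomial 0x07 assumed) flips every single bit and every adjacent bit pair of an 8-byte frame and counts corruptions the CRC misses:

```c
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-8, polynomial 0x07 -- an assumed choice. */
uint8_t crc8(const uint8_t *p, int len)
{
    uint8_t crc = 0x00;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Count single-bit and adjacent double-bit corruptions of an 8-byte frame
 * that leave the CRC unchanged (i.e. that would slip through undetected). */
int count_undetected(const uint8_t frame[8])
{
    uint8_t good = crc8(frame, 8), t[8];
    int missed = 0;

    for (int bit = 0; bit < 64; bit++) {          /* single-bit errors */
        memcpy(t, frame, 8);
        t[bit / 8] ^= (uint8_t)(1u << (bit % 8));
        if (crc8(t, 8) == good) missed++;
    }
    for (int bit = 0; bit < 63; bit++) {          /* adjacent double-bit */
        memcpy(t, frame, 8);
        t[bit / 8]       ^= (uint8_t)(1u << (bit % 8));
        t[(bit + 1) / 8] ^= (uint8_t)(1u << ((bit + 1) % 8));
        if (crc8(t, 8) == good) missed++;
    }
    return missed;
}
```

For any frame contents this returns 0: both error classes fall within the burst-detection guarantee of an 8-bit CRC.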

The CRC questions notwithstanding, are there any comments/concerns about the rest of the Data Link Layer's protocol?

[edit:] I have updated the sections above to include the CRC16 information. Because most operations should be possible without data frames, I've specified that data frames, and therefore CRC16, are optional for all devices except the bus controller. The bus controller must implement data frame handling.

The biggest reason to make data frames optional is that I don't want an application, which would otherwise need only a very small processor, to require a larger processor just to handle two types of CRC and storing 128-byte data frames.

As large processors become more prevalent and cheaper, this becomes less of an issue, but people are still hanging on to PIC16 and PIC18, so it seems relevant.

Edited 2 time(s). Last edit at 04/23/2009 01:02PM by Annirak.
Re: Reprap Module Communication Working Group
April 23, 2009 02:10PM
CRC16 requires on the order of 24 instructions per byte on most PIC/AVR systems, without using any tables. The total cost of receiving streaming data should be under 4 instructions per data bit, so even a 4MHz PIC should be able to receive data at 1Mbit/s while computing a CRC16.
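A table-free CRC-16 of the kind described here is the CRC-16/ARC algorithm (reflected polynomial 0xA001), the same one avr-libc exposes as `_crc16_update`, written portably:

```c
#include <stdint.h>

/* One byte of CRC-16/ARC (reflected poly 0xA001, init 0x0000), no table:
 * eight shift/XOR steps, cheap enough to run inside a receive interrupt. */
uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int i = 0; i < 8; i++)
        crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001)
                         : (uint16_t)(crc >> 1);
    return crc;
}
```

Feeding the standard check string "123456789" through this, starting from 0x0000, yields 0xBB3D, the published check value for CRC-16/ARC.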

-Geert
Reprap Network Layer
April 23, 2009 05:33PM
The network layer for this protocol is relatively minimal. We support up to 254 devices and one bus controller, while the protocol I described for the data link layer only provides for 32 devices and one bus controller.

To make this extensible, we should specify a frame router. It will be relatively trivial to implement, though it will require a minimum of 2 UARTs. On one side, it will be a device, while on the other, it will mimic the bus controller. Much like the bus controller, the router will issue beacons. In the "host control frames" section, it will insert broadcast frames and frames bound for devices that it has negotiated with. During negotiation, a router will notify its bus controller of the devices that it has attached.

To make this viable, we may need to add the source address to control and data frames, rather than leaving them as implied by timing.

Routers should work on a store & forward basis. This means that routers can be used to do baud-rate conversion, allowing them to couple low-baud-rate segments to high-baud-rate segments.

If we operate on an "ignore packets with CRC errors" basis, then the router will take care of retransmission of packets that it received correctly but that the device did not acknowledge.

Edited 2 time(s). Last edit at 04/23/2009 05:36PM by Annirak.
jbb
Re: Reprap Network Layer
April 25, 2009 01:58AM
Errr, two questions:

1) If you have a 'light' device that doesn't do data frames, how will it report back to the bus master what it is and what baud rates it can support?

2) How about using the same CRC check (I feel CRC16 would be best) for both stages - it will reduce coding effort and CRCs can be tricky to debug?

jbb
Re: Reprap Module Communication Working Group
April 25, 2009 06:07AM
I forgot to mark this thread.

Annirak wrote:
"
An N-bit CRC offers
* 100% detection of single-bit errors;
* 100% detection of all adjacent double-bit errors;
* 100% detection of any errors spanning up to N-bits;
* 100% detection of all two-bit errors not separated by an exact multiple of 2^N-1 bits
* For arbitrary multiple errors spanning more than N bits, at worst 1 in 2^N failures

On that basis, we only need CRC8 for control frames, but we do need CRC16 for data frames.
"

It's easy to get misled by the 100% figures in the above paragraph. Closer examination reveals the statistics of the errors that actually slip through, though.


Like nophead wrote:
"If the odd undetected error gets by then the machine will get a reputation for being unreliable since a small error can completely ruin a print."

And like I wrote:
"...over hours of time transmitting messages under marginal conditions, a one byte parity check is asking for trouble, especially without other tests."

I stick by my suggestion of using parity as well, for the reasons I've already stated.

The concern with errors is that "two wrongs can make a right": eventually, in a bad comms environment, a data error and a CRC error will align in agreement when tested. In a clean communication environment, the discussion is purely academic. In a real-world environment, the statistics of error detection can be vital.

Re the cost of the RJ45 crimp tool: I have one. I need it for my work. Most people don't. At $20 USD, for a tool that someone will only use a few times, I think most people would prefer to buy a flat-bladed screwdriver. Certainly RJ45 is neater though, and if people are OK buying replacement leads and folding the excess, then it's acceptable.

Annirak wrote:
"
I do not want to limit the design specification based on the possibility that someone somewhere might some day want to connect a PC.
"

Um... like to get the hardware and software tested and debugged prior to final networking ?
Limitations based on... not using crystals from which you can easily derive standard PC Baud rates ?

A 4.9152 MHz crystal divides down quite nicely to the popular baud rates, or you can go higher (9.8304 MHz, 19.6608 MHz, etc.); 9600 baud is on the zero-error scale for these crystals. Most people look for round numbers on the end of their crystals, not their baud rates, and then end up stuck with only a few error-free baud rates. A reprap~CNC machine doesn't really need crystal accuracy for anything else, so the determining factor for baud rate compatibility is primarily the crystal.
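This crystal arithmetic can be checked with the usual divisor formula for an AVR-style asynchronous UART, baud = f_cpu / (16 * (UBRR + 1)). The UBRR register name is AVR-specific; other UARTs use different divisors:

```c
/* Nearest AVR-style UBRR divisor for a given crystal/baud pair. */
long best_ubrr(long f_cpu, long baud)
{
    return (long)((double)f_cpu / (16.0 * (double)baud) - 1.0 + 0.5); /* round */
}

/* Percent error between the requested baud rate and what the divisor gives. */
double baud_error_pct(long f_cpu, long baud)
{
    long ubrr = best_ubrr(f_cpu, baud);
    double actual = (double)f_cpu / (16.0 * (double)(ubrr + 1));
    return (actual - (double)baud) / (double)baud * 100.0;
}
```

A 4.9152 MHz crystal gives UBRR = 31 with exactly 0% error at 9600 baud, while a "round number" 16 MHz part lands on UBRR = 25 at 38400 baud, about +0.16% off -- which is why crystal choice, not MCU speed, sets the usable rates.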
Re: Reprap Module Communication Working Group
April 27, 2009 01:31PM
jbb,
Using the same CRC for both kinds of frame makes sense. The problem that I have with it is that it adds a significant number of bytes to the beacon; it will bring the total from 1000 to 1036. If we can still get an adequate beacon rate in there with an error-free baud rate, then I'm all for it.

Incidentally, I don't think that we should necessarily follow traditional baud rates. They don't much matter once you pass 115kbps anyway.

grael Wrote:
-------------------------------------------------------
> It's easy to get missled by the 100% figures in
> the above paragraph. Closer examination reveals
> some simple assessments of the statistics of the
> errors that actually slip through though.

True, but I'm not going to get in to CRC32, MD5, or SHA1 for this kind of application. While they would give much better results, the computation is significantly more intensive and the result takes a much larger data block.

> I stick by my suggestion of using parity as well,
> for the reasons I've already stated.

Is the argument here that CRC16 will detect multiple errors within a byte, but for multiple errors that span multiple bytes, CRC16 may fail where parity (CRC1) may catch it? Of course if you get pairs of errors in multiple bytes, all bets are off, but CRC16 has only a 1-in-65536 chance of failing to detect such an error.

> The concern with errors, is that "two wrongs can
> make a right", and eventually, in a bad coms
> environment, it will happen, a data error, and a
> CRC error that align in agreement, when tested. In
> a clean communication environment, the discussion
> is purely academic. In a real world environment,
> the statistics of error detection can be vital.

Honestly, I think the environment will be much less noisy than expected. This is why I'm specifying things like CAT5e and RS485, which are designed for industrial environments and noise prevention. I don't think we're going to get a lot of comms errors.

> re tool cost of the crimp tool for RJ45, I have
> one. I need it for my work. Most people don't. A
> crimp tool at $20 U$D, that someone will only use
> a few times- I think most people would prefer to
> buy a flat bladed screw driver. Certainly an RJ45
> is neater though, and if people are OK buying
> replacement leads and folding the excess, then
> it's acceptable.

One big advantage of RJ45 is that it guides people towards using CAT5e. With screw terminals, it's more attractive to use untwisted pairs of unknown impedance, which could destroy >1 Mbps communication.

> Annirak wrote:
> "
> I do not want to limit the design specification
> based on the possibility that someone somewhere
> might some day want to connect a PC.
> "
>
> Um... like to get the hardware and software tested
> and debugged prior to final networking ?

No, you should be doing that through the bus controller.

> Limitations based on... not using crystals from
> which you can easily derive standard PC Baud rates
> ?

I'm talking about limitations based on slow interfaces to PCs.
Outstanding questions
April 27, 2009 03:19PM
I think our specification is progressing nicely. Before I go on and lay out my ideas for the top few layers of the spec, I thought I should give a summary of the undecided issues so far.

Physical layer:
--Style of connectors--Is the use of an RJ-45 crimp tool prohibitive to the specification?
--Cable specification--Should cable other than CAT5 or better be permitted? If so, for what speeds?
--Strength of steering resistors on the bus controller PCB

Data link layer:
--Beacon based communication--is it suitable?
--Beacon period--how slow is too slow? Is 0.01s fast enough for a physical system?
--Slot sizes--should the number of slots available be sized for the available devices during negotiation? That would mean that devices would need to be able to configure themselves to handle multiple lengths of control slots
--CRC, parity--Is CRC16 good enough? Which polynomial should we use? Is it necessary to implement parity as well as CRC16?

Network layer:
--Is it necessary to make routers? If so, should we add a source address to control & data frames?
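On the CRC16 question above: a bit-by-bit CRC16 is only a few lines of C. This sketch uses the common CRC-16/ARC form (polynomial 0x8005, processed in its reflected form 0xA001, initial value 0x0000); the working group hasn't settled on a polynomial, so treat it as one candidate, not the spec:

```c
#include <stdint.h>
#include <stddef.h>

/* Bit-by-bit CRC-16/ARC (poly 0x8005, reflected 0xA001, init 0x0000).
 * One candidate polynomial for the frame check; not yet agreed upon. */
uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0x0000;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                      /* fold in next byte */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (crc >> 1) ^ 0xA001;   /* divide by the polynomial */
            else
                crc >>= 1;
        }
    }
    return crc;
}
```

A table-driven variant trades 512 bytes of flash for far fewer loop iterations per byte, which may matter on small MCUs at the higher baud rates under discussion.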
Transport layer & session layer
April 27, 2009 04:38PM
The transport layer in our spec handles only one thing: packet retries.

The best solution I have come up with is to add several control frames:
Acknowledge: [ack][fid][sender][original beacon timestamp][crc]
This would indicate that the function [fid] was received at [original beacon timestamp] from [sender] and will be processed.

CRC error: [ncrc][fid][sender][original beacon timestamp][crc]
This would indicate that the function [fid] received from [sender] at [original beacon timestamp] suffered a CRC error. The [fid] may be incorrect, but [sender] will be correct because it is identified by time slice, not by transmitted data.

Negative Acknowledge: [nak][eid][sender][original beacon timestamp][crc]
This indicates that the function id requested by [sender] at [original beacon timestamp] failed due to [eid], which could indicate: buffer full, busy, function not supported.

These are pretty rough ideas, and need some more refinement.
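As a rough illustration of those three frames, here is one possible C layout. The field widths (1-byte ids and sender, 4-byte timestamp, 2-byte CRC) and the opcode values are assumptions; the post deliberately leaves them open:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical opcodes for the three transport-layer control frames. */
enum { CTL_ACK = 0x06, CTL_NCRC = 0x15, CTL_NAK = 0x16 };

typedef struct {
    uint8_t  opcode;     /* ack / ncrc / nak */
    uint8_t  id;         /* fid for ack/ncrc, eid for nak */
    uint8_t  sender;     /* address of the original sender */
    uint32_t timestamp;  /* beacon timestamp of the original request */
    uint16_t crc;        /* CRC16 over the preceding bytes */
} ctl_frame_t;

/* Serialize a control frame into a 9-byte wire buffer, little-endian. */
size_t ctl_frame_pack(const ctl_frame_t *f, uint8_t buf[9])
{
    buf[0] = f->opcode;
    buf[1] = f->id;
    buf[2] = f->sender;
    buf[3] = (uint8_t)(f->timestamp);
    buf[4] = (uint8_t)(f->timestamp >> 8);
    buf[5] = (uint8_t)(f->timestamp >> 16);
    buf[6] = (uint8_t)(f->timestamp >> 24);
    buf[7] = (uint8_t)(f->crc);
    buf[8] = (uint8_t)(f->crc >> 8);
    return 9;
}
```

The fixed 9-byte size means a device can allocate one static buffer for all three frame types.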

Session Layer:
This layer is where all the details start to come out. The session layer will cover how negotiation is performed and how hotplugging, if any, is performed. In addition, this will handle error counting and failover to slower baud rates when excessive errors are present. The majority of this is handled by the bus controller.

Negotiation:
-Addresses
First we must decide if device addresses should be assigned by the bus controller or preprogrammed by the developer. I think the best option is probably to take a third approach, which is similar to I2C. Different classes of device should have different base addresses which have offsets that are configured using jumpers, so that multiple copies of the same board may be used on the same bus without rebuilding the firmware for each. While this is just a suggestion, some method must be undertaken to ensure that there are only unique addresses on the bus.
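A minimal sketch of that jumper-offset scheme, with made-up base addresses (the spec hasn't assigned any) and three jumpers per board:

```c
#include <stdint.h>

/* Hypothetical per-class base addresses; the spec assigns none yet. */
#define CLASS_STEPPER_BASE 0x10
#define CLASS_HEATER_BASE  0x20

/* Compose a bus address I2C-style: class base plus jumper-selected
 * offset, so up to 8 identical boards coexist without firmware edits. */
static inline uint8_t device_address(uint8_t class_base, uint8_t jumpers)
{
    return class_base | (jumpers & 0x07);  /* 3 jumpers: 8 per class */
}
```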

-Discovery
The bus controller must initiate a discovery sequence at 9600 baud. Because we only have a maximum of 254 possible devices, it's practical to ask each of them to identify.
The sequence needs to be clearly defined on the bus, so that any device can recognize that a discovery is in progress.

To initiate a discovery on power up, the host will send the beacon identifier followed by a device address ranging from 0x01 to 0xFE:
[break][idle byte][0xAA][addr]

It will wait up to two byte times after sending the identifier to start receiving the identification frame.

The identification frame must contain some information about the device which will allow the bus controller to negotiate with it and report it to the control software.

In protocols such as USB, this is known as a device descriptor. In SCSI, this is the response to a "Report General" frame.

Before we get to the organization of the data, here's the information we need. A lot of the details in this section relate very closely to the application layer, however because they are a part of the negotiation process, I'll gloss over them now.

Device address
Supported data rates--I don't know if this should be a bitfield or actual numeric data rates
Device identification string

Device class (servo driver, stepper driver, position sensor, temperature sensor, solenoid/relay driver, heater driver, rotation sensor (not the same as position), distance sensor, image capture sensor, hybrid device.)

If the device identifies as a hybrid device, then it may list as many devices as it contains. For example, a pinchwheel extruder would identify as a hybrid device with a stepper motor driver, a rotation sensor, temperature sensor, and heater driver.

Complex device class (none,additive toolhead controller, subtractive toolhead controller, manipulative toolhead controller, motor controller, temperature controller)
These devices are hybrid devices which do their own processing and offer an abstract communication & control interface.

Device specific data
For a toolhead controller, this will indicate tool sizes, movement rates, materials, etc.

So, for the actual structure,

[device address (1)]
[baud bits(4)] -- see below
[idstrlen(1)][id string (idstrlen)]
[devclass(1)]<[devclasscount(1)][dev classes(devclasscount)]>
[complexclass(1)]
[devdatacnt(1)][devdata(devdatacnt)]
[crc16]
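A device-side parser gives a feel for this variable-length layout. This sketch just walks the frame and returns its total length; the hybrid class value (0xFF) and the bounds-checking policy are assumptions:

```c
#include <stdint.h>
#include <stddef.h>

#define DEVCLASS_HYBRID 0xFF  /* assumed marker for hybrid devices */

/* Walk the identification frame described above and return its total
 * length in bytes (including the trailing CRC16), or 0 if the buffer
 * is too short to hold a complete frame. */
size_t ident_frame_len(const uint8_t *buf, size_t buflen)
{
    size_t i = 5;                       /* address (1) + baud bits (4) */
    if (buflen <= i) return 0;
    i += 1 + buf[i];                    /* idstrlen + id string */
    if (buflen <= i) return 0;
    if (buf[i++] == DEVCLASS_HYBRID) {  /* devclass */
        if (buflen <= i) return 0;
        i += 1 + buf[i];                /* devclasscount + class list */
    }
    i += 1;                             /* complexclass */
    if (buflen <= i) return 0;
    i += 1 + buf[i];                    /* devdatacnt + devdata */
    i += 2;                             /* crc16 */
    return (i <= buflen) ? i : 0;
}
```

Because every variable part carries its own length byte, the bus controller can skip fields it doesn't understand, which helps forward compatibility.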


Baud rate bit assignments: a 1 in the bit position given by the left-hand number indicates support for the rate on the right.
0- 9600
1- 19200
2- 38400
3- 57600
4- 76800
5- 100000
6- 115200
7- 153600
8- 200000
9- 230400
10- 250000
11- 307200
12- 460800
13- 500000
14- 614400
15- 921600
16- 1000000
17- 1152000
18- 1228800
19- 2000000
20- 2304000
21- 2457600
22- 2500000
23- 4608000
24- 4915200
25- 5000000
26- 9216000
27- 9830400
28- 10000000

After discovery is complete, the bus controller will issue a broadcast control frame for switching data rates, and increase to the maximum data rate supported by all devices.
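Picking that maximum common rate is just a bitwise AND of all the support fields followed by a highest-set-bit scan. A sketch, assuming the 29-bit field above:

```c
#include <stdint.h>

/* Intersect every device's 29-bit baud support field and return the
 * index of the highest common rate (0 = 9600 ... 28 = 10 Mbps), or -1
 * if there is no overlap at all (stay at 9600 and warn the user). */
int best_common_rate(const uint32_t *fields, int n)
{
    uint32_t common = 0x1FFFFFFFu;      /* 29 defined rate bits */
    for (int i = 0; i < n; i++)
        common &= fields[i];
    for (int bit = 28; bit >= 0; bit--)
        if (common & (1u << bit))
            return bit;
    return -1;
}
```

Since 9600 (bit 0) is the mandatory discovery rate, a well-formed field should always have bit 0 set, and the -1 case indicates a misbehaving device.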

If the bus controller issues a break at 9600 baud, this will appear as multiple breaks at any higher data rate. Because of this, any time a device receives multiple breaks, it must immediately switch to 9600 baud, wait for the rx line to go high, then prepare to receive a 0xAA. If it receives a 0xAA at 9600 baud, then it will wait to be enumerated in the fashion described above.
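The multiple-break fallback could be modeled on the device side like this; how a UART reports a break is hardware-specific, so this sketch only tracks the counter and the mode switch:

```c
#include <stdbool.h>

/* Device-side model of the "multiple breaks means drop to 9600" rule.
 * A break typically arrives as a framing error; the exact reporting
 * mechanism depends on the UART hardware. */
typedef struct {
    int  break_count;
    bool at_9600;
} resync_t;

/* Feed one receive event; returns true exactly when the device must
 * fall back to 9600 baud and listen for the 0xAA discovery marker. */
bool resync_on_event(resync_t *s, bool framing_error)
{
    if (framing_error) {
        if (++s->break_count >= 2 && !s->at_9600) {
            s->at_9600 = true;   /* reconfigure the UART divisor here */
            return true;
        }
    } else {
        s->break_count = 0;      /* a clean byte: not a discovery restart */
    }
    return false;
}
```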
Re: Transport layer & session layer
April 28, 2009 03:32AM
Annirak Wrote:
-------------------------------------------------------

> -Addresses
> First we must decide if device addresses should be
> assigned by the bus controller or preprogrammed by
> the developer. I think the best option is
> probably to take a third approach, which is
> similar to I2C. Different classes of device
> should have different base addresses which have
> offsets that are configured using jumpers, so that
> multiple copies of the same board may be used on
> the same bus without rebuilding the firmware for
> each. While this is just a suggestion, some
> method must be undertaken to ensure that there are
> only unique addresses on the bus.
I think that's an excellent suggestion Annirak.

> -Discovery
> The bus controller must initiate a discovery
> sequence at 9600 baud. Because we only have a
> maximum of 254 possible devices, it's practical to
> ask each of them to identify.
> The sequence needs to be clearly defined on the
> bus, so that any device can recognize that a
> discovery is in progress.

I know you don't like designing a limitation in on purpose Annirak, but I'm curious as to how many devices you think there could reasonably be, on this network.
With the sort of frame length you are talking of, and arbitration over many slave devices, the average 8-bit microcontroller is ruled out as a bus master unless it has off-chip memory expansion. It's been done before, but here the key is high speed synchronisation, I think.

Although I appreciate the advantages of modularity (sometimes), performance can take a big hit if you are trying to optimise but lack hardware resources.

Much of what this forum is involved in, I would class as "expert systems", in which code space on the microcontroller, and processing speed, are often critical to performance improvements. A typical microcontroller isn't as good at hiding the bureaucratically inefficient aspects of such a system as a PC is.

So, I'm a little bewildered by your expenditure of such personal effort on a highly structured, multi-layered protocol. If it works, there's then the question of testing adherence to the specification you are working on, or else consequences for non-adherence.


I take your point about RJ45s and cat5e twisted pair, but I'm not so sure it's a specification breaker. There are many systems (for example in the security industry) that almost exclusively use screw-terminal termination over building-wide areas. In practice, the ends can be wired screw terminal at one end and RJ45 at the other and still work. RJ45 doesn't handle very large conductors, though, so it's a connection method that restricts running power down the same cable. You can double up the other conductors; if so, it needs to be decided on now rather than later.

Regarding baud rate negotiation, what happens if some critical devices fail to agree on a common baud rate above 9600 baud ?

You end up being reliant on crystals, minimum error % in baud rates, and fast enough crystals that the dividing numbers give sufficient fudge factor to get a close enough baud rate. The PC is still a great debugging tool for this stuff !

Another thing to consider, (and this one's partly my fault for recommending half duplex), is that if you want to optically isolate, you need to have control over how soon to release the bus after a transmission, when the bus drive pin is not under direct control of the device originating the transmission burst.


> Device class (servo driver, stepper driver,
> position sensor, temperature sensor,
> solenoid/relay driver, heater driver, rotation
> sensor (not the same as position), distance
> sensor, image capture sensor, hybrid device.)
>
> If the device identifies as a hybrid device, then
> it may list as many devices as it contains. For
> example, a pinchwheel extruder would identify as a
> hybrid device with a stepper motor driver, a
> rotation sensor, temperature sensor, and heater
> driver.
Again, this is asking for more master PCB processing power than most in the reprap forum currently have on their motherboards. You can ask for all that information, but then you've actually got to do something with it !

>
>
>
>
> baud bit representations, a 1 in the bit position
> of the left-hand number indicates support for the
> rate to the right.
> 0- 9600
> 1- 19200

> 26- 9216000
> 27- 9830400
> 28-10000000
Many of those faster baud rates are out of spec for the common RS485 chips.
Another complication to the issue of compliance...
Re: Reprap Module Communication Working Group
April 28, 2009 07:11AM
Greetings all,

A question from out of the blue (and not meant to disparage the considerable amount of thinking on this matter), apart from using a differential pair for the signalling, wouldn't an inter-module comm system built upon the SNAP protocol again serve adequately for reprap?

I realize that the last SNAP-based reprap design was implemented in PIC firmware (which I never built; I have an aversion to PICs), but the protocol was specified, and was reasonably expandable. The token-ring approach has its limitations, but I suspect that it could (at a fast enough baud rate/with suitable ECC) be made to serve. Similarly, I think I could deal with assigning addresses manually.

Just food for thought, and again not meant to run down others' thoughts and efforts to improve reprap.


Larry Pfeffer,

My blog about building repstrap Cerberus:
[repstrap-cerberus.blogspot.com]
Re: Reprap Module Communication Working Group
April 28, 2009 01:30PM
Larry_Pfeffer Wrote:
-------------------------------------------------------
> Just food for thought, and again not meant to run
> down others' thoughts and efforts to improve
> reprap.

Hi Larry,
You're right about reusing existing design work. It makes sense. The problem is the bus topology. A token ring network is inherently contention free. The difference with a half-duplex bus is that every device receives everything which is transmitted. That makes rebroadcasting for each "token pass" highly inefficient. The target device might as well grab its data on the first pass. But there needs to be an arbitration method to ensure that the bus stays contention free.

This is why I suggested the beacon method. However, that being said, I've made some errors in the design, which I intend to fix soon.

grael Wrote:
-------------------------------------------------------
> I know you don't like designing a limitation in on
> purpose Annirak, but I'm curious as to how many
> devices you think there could reasonably be, on
> this network.

Actually, I was planning for 32 as the maximum for the bus. That doesn't mean that their addresses are contiguous.

> With the sort of frame length you are talking of,
> and arbitration over many slave devices, the
> average 8 bit microcontroller is ruled out as a
> bus master, unless it has off chip memory
> expansion. It's been done before, but here the key
> is high speed synchronisation, I think.

This is less of an issue than it appears. Data need only be saved in a given device if the destination address matches that device's address.

> Much of what this forum is involved in, I would
> class as "expert systems"

Expert systems are software which attempts to reproduce the performance of one or more human experts. Reprap is hardly an expert system. In fact, I'd say that reprap outperforms human experts in the field of building a 3D object from layers.


> A
> typical microcontroller isn't as good at hiding
> the beaucratically inefficient aspects of such a
> system as a PC is.
> So, I'm a little bewildered by your expenditure of
> such personal effort on a highly structured multi
> layered protocol. If it works, there's then the
> question of testing adherence to the specification
> you are working on, else, consequences for non
> adherence.

You're right: there is a little bureaucratic inefficiency in the protocol I've outlined, and there are some optimizations to be done on it. This is why I called this a working group: I want input and suggestions so that we can make this spec the best it can possibly be. I had a few thoughts last night and I think I'm going to completely rewrite the data link layer.

Don't get too caught up in the multi-layered nature of the protocol. All protocols are multi-layered; it's just a structured way to organize the information about the protocol.

> I take your point about RJ45s and cat5e twisted
> pair, but I'm not so sure it's a specification
> breaker. There are many operating systems (for
> example in the security industry) that almost
> exclusively use screw terminal termination and
> over building wide areas. In practice, the ends
> can be wired screw terminal one end, and RJ45 the
> other, and still work. RJ45 doesn't handle very
> large conductors though, so it's a connection
> method that's restrictive of using power down the
> same cable. You can double up other conductors, if
> so, it needs to be decided on now, rather than
> later.

I have spec'd two conductors for both 12V and GND. I agree that the use of CAT5e may not determine whether a system will work or not. I always did say that this part of the spec is a recommendation.

>
> Regarding baud rate negotiation, what happens if
> some critical devices fail to agree on a common
> baud rate above 9600 baud ?

Good question. I'd say that the bus controller should warn you that the negotiated baud rate is very low and tell you why. You might need to change a crystal and some dividers.

> You end up being reliant on crystals, minimum
> error % in baud rates, and fast enough crystals
> that the dividing numbers give sufficient fudge
> factor to get a close enough baud rate. The PC is
> still a great debugging tool for this stuff !

This will always be a problem for devices built by different people. We have a choice here: we can either allow many baud rates or require specific ones, in which case the baud rate support field dictates only the maximum.

> Another thing to consider, (and this one's partly
> my fault for recommending half duplex), is that if
> you want to optically isolate, you need to have
> control over how soon to release the bus after a
> transmission, when the bus drive pin is not under
> direct control of the device originating the
> transmission burst.
>
This is an issue for all half-duplex, shared bus protocols. In our case, every device knows when it should be transmitting. If it is not transmitting, it must be receiving.

> > Device class (servo driver, stepper driver,
> > position sensor, temperature sensor,
> > solenoid/relay driver, heater driver, rotation
> > sensor (not the same as position), distance
> > sensor, image capture sensor, hybrid device.)
> >
> > If the device identifies as a hybrid device,
> then
> > it may list as many devices as it contains.
> For
> > example, a pinchwheel extruder would identify as
> a
> > hybrid device with a stepper motor driver, a
> > rotation sensor, temperature sensor, and heater
> > driver.

> Again, this is asking for more master PCB
> processing power than most in the reprap forum
> currently have on their motherboards. You can ask
> for all that information, but then you've actually
> got to do something with it !

Ironically, I think you've got it backwards. If a device identifies as a hybrid, it's just saying that it will accept multiple command sets. Generally speaking, devices shouldn't be accessed in hybrid mode. Direct control of operating parameters would be good for debugging, but not recommended. Most hybrid devices would also be "complex" devices. These would be things like motor controllers and toolhead controllers. Each of these is a hybrid device, but each of these has a very reduced command set. I'll get into this more in a later post, but the idea is that the majority of the processing for a toolhead should be offloaded to a toolhead controller, leaving the bus controller free to do other things.

For example,
An additive toolhead has a primary function: it puts material down. It can do this with a given size and speed. The controlling factor for most systems is the size--that is that the size required dictates the speed. So a bus controller would tell the toolhead controller how big a line to draw. The toolhead controller would reply with the maximum speed for that size of line. Then the bus controller would tell its motor controllers how fast to move and where to stop, and tell the toolhead controller to start drawing.

Please don't get into the details on this concept yet. It's just an overview to explain the idea of a complex device.

> > 26- 9216000
> > 27- 9830400
> > 28-10000000
> Many of those faster baud rates are out of spec
> for the common RS485 chips.
> Another complication to the issue of compliance...

It's not so bad. That's kinda the point of the baud rate support field. But you're wrong about being out of spec for common devices. There's a very nice DIP RS485 level transceiver for $0.80 that does 10Mbps.
[search.digikey.com]

Annirak
Data Link Layer, V2
April 28, 2009 03:31PM
A new (better?) approach to sharing the bus.

The idea behind the beacon system was to prevent collisions. The protocol guaranteed that there would be a regular pulse of beacons that devices could use to communicate. While this seemed like a good idea at the time, it's more of a networking concept than a microcontroller bus concept. While I don't think that the idea is bad, per se, I do think that it absorbs too much time on the bus, using data frames when they aren't necessary, and continuously broadcasting beacons which may not be necessary, but must be processed by all the devices on the bus.

So how do we prevent collisions if there isn't a steady stream of beacons? And how do we insert data slots only where they are needed?

The problem with any device transmitting at a given moment is that its data may get overwritten by another device's data. One way to prevent this is to guarantee that all the devices transmit the same data if they transmit at the same time.

To accomplish this, here is my suggestion: All devices need a communication timer. If the bus has been idle for one timeout period, any device may signal that it wishes to transmit by transmitting a break. Because another device may have simultaneously transmitted a break, each device must transmit the break condition only until it receives a break interrupt. Following the break interrupt, the bus will be idle for two byte times. Following that, the bus controller will transmit a control beacon:

[break][0xAA][timestamp(4)][crc16(2)]
The beacon is followed by an [idle byte] and control frame timeslots for each device
[host control frame(9)] ([device control frames]*(1-32))

This is much smaller than my previous beacons.
If a data frame is requested by a device, or the bus controller wishes to transmit a data frame, it will send the following beacon sequence:
[break][0x55][timestamp(4)][sending device address(1)][crc16(2)]
After an [idle byte], the addressed device may send a data frame with up to 128 data bytes

If more than one data frame is requested in a beacon, the bus controller will queue the requests and alternate data frames with control beacons, so that high-priority communication doesn't get swamped.

As long as there is any activity in a control timeslot, the bus controller will transmit one more control beacon. If a control beacon is idle (and there are no more data requests in the queue), the bus controller will allow the bus to go idle, and wait for a framing error (break) or an internally queued data or control request before it transmits a new beacon.

To maintain synchronization, the bus controller will transmit a control beacon at a minimum rate of once per second.

I think this cleans up the protocol quite a bit, and should reduce excess processing as well.
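From the device's point of view, the revised scheme reduces to a small state machine. This is a sketch with placeholder timing; slot counting and the actual UART driving are elided:

```c
#include <stdint.h>
#include <stdbool.h>

/* Device-side sketch of the V2 arbitration: after one idle timeout a
 * device that wants the bus drives a break until it sees the break
 * interrupt, releases the line, waits for the controller's beacon,
 * then transmits only in its own control slot. */
typedef enum { BUS_IDLE, BUS_REQUESTING, BUS_WAIT_BEACON, BUS_IN_SLOT } bus_state_t;

typedef struct {
    bus_state_t state;
    uint32_t idle_ticks;
} dev_bus_t;

#define IDLE_TIMEOUT_TICKS 10  /* placeholder for one timeout period */

/* Advance the device bus logic by one tick and return the new state. */
bus_state_t bus_tick(dev_bus_t *d, bool bus_active, bool want_tx,
                     bool break_seen, bool beacon_seen)
{
    switch (d->state) {
    case BUS_IDLE:
        d->idle_ticks = bus_active ? 0 : d->idle_ticks + 1;
        if (want_tx && d->idle_ticks >= IDLE_TIMEOUT_TICKS)
            d->state = BUS_REQUESTING;   /* start driving a break */
        break;
    case BUS_REQUESTING:
        if (break_seen)                  /* our break or a peer's: same result */
            d->state = BUS_WAIT_BEACON;  /* release the line */
        break;
    case BUS_WAIT_BEACON:
        if (beacon_seen)
            d->state = BUS_IN_SLOT;      /* count slots, tx only in ours */
        break;
    case BUS_IN_SLOT:
        d->state = BUS_IDLE;             /* slot finished */
        d->idle_ticks = 0;
        break;
    }
    return d->state;
}
```

Note that a device never needs to know whether the break it saw was its own, which is what makes simultaneous requests harmless.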

Edited 1 time(s). Last edit at 04/28/2009 03:36PM by Annirak.
Re: Data Link Layer, V2
April 28, 2009 04:13PM
As an observation from outside this discussion, it does seem odd that you are reinventing the wheel. There are plenty of protocols out there, and surely one of them fills what is needed and is available off the shelf.

As far as I can see this discussion started with some decisions already made, and no clear definition of what is actually needed.

Can I suggest that it would be a good idea to specify what is actually needed and then look to find what interfaces and protocols might fit the bill.

David
Re: Reprap Module Communication Working Group
April 28, 2009 05:36PM
Another thought re inter-module comms:

Use dedicated arbitration hardware:
Instead of attempting to resolve contention entirely in band (that is, on the data lines themselves), would it not be easier to adopt a (stripped-down) version of (one of the many) bus request/grant schemes? These typically used a geographically-ordered priority to handle the simultaneous-request problem, and used dedicated hardware to handle the passing/grabbing of bus access. Similarly, I think bus-hold periods were enforced in hardware in some.
The minimal hardware is probably a small CPLD, maybe just a flip-flop or two.

A related thought: since we have little or no legacy hardware to support with this scheme, why not adopt a fairly high baud rate as our default? Faster comms generally means greater bus availability. If those TI transceivers can do over a megabaud, why not adopt a rate (supported by common crystals) that is a respectable fraction of what those chips can do? 9600 was a good de-facto standard with 15+ year-old hardware, but today it's a sloooow crawl, IMHO.


Larry Pfeffer,

My blog about building repstrap Cerberus:
[repstrap-cerberus.blogspot.com]
Re: Reprap Module Communication Working Group
April 28, 2009 08:12PM
David,
I'd be all for using CAN, but it severely limits the available MCUs that can do the job, and restricts what debugging options are available. An RS485 scheme makes sense. The protocol needs to be suitable for our needs in reprap. I'm trying to tailor a protocol to reprap.

Larry,
The only compatibility I'm trying to maintain is hardware compatibility with those who have bought the RRRF toolhead controller and the RRRF motherboard. I'm not sure whether that's a good goal, but it does preclude the use of dedicated arbitration hardware. Those two units actually use this very same transceiver (in SOIC form), no doubt due to the extreme baud rate and low cost.

What does everyone think of my revised arbitration scheme?

1) Send a break if the bus is idle; otherwise go to 2.
2) Wait for a beacon from the bus controller, then transmit your control frame.
3) Ask for more space if you need it.
4) When you're done, let the bus go idle.
Re: Reprap Module Communication Working Group
April 29, 2009 04:19AM
I always use a simple master / slave protocol on RS485, so never need the complexity of arbitration. I can't think of a reason why the tool heads can't be slaves and the main controller a master. That is how HydraRaptor works.


[www.hydraraptor.blogspot.com]
Re: Reprap Module Communication Working Group
April 29, 2009 05:33AM
Annirak,

My point is in your response.

You say "The protocol needs to be suitable for our needs in reprap". Where is the definition of what reprap needs? I have seen nothing about bandwidth, about address ranges, about whether this is master to slave, or slave to slave, about response latencies, or any of these things. Surely these need to be defined first.

David
Re: Reprap Module Communication Working Group
April 29, 2009 06:29AM
What crystals are being used in the RRRF motherboards ?

I'm still considering a crystal for my machine. From memory, I used to use 3.6864 MHz or 7.3728 MHz quite a lot, because I could do perfect real-time clock divisions and also get some useful baud rates with them. I need to check whether these, or a multiple of them, will get the factory boot loader running on the STM32F103VBT6 chips I'm using; I think 8 MHz has been used with success on the boot loader, but it does auto-negotiate to an extent, like the HC11. The STM32 can synthesize higher frequencies, and I'll be trying to run it close to the current 72 MHz max internal clock rate.