SNAPComms

From RepRap
Revision as of 12:25, 17 February 2007 by AdrianBowyer (talk) (version migrated from twiki)

The SNAP communications protocol in the PIC firmware

This is just a cut-and-paste from Simon's comments in the code.

Implementation of interrupt driven SNAP communications routines for use in a token ring configuration. Being completely interrupt driven improves performance and simplifies development of the core logic in each device. This is not really a true token ring in that there is no token frame and the procedure for ring insertion etc. is trivial -- it is however a network with ring topology.

Overview/notes:

A receive buffer accepts payload data as it arrives. Upon completion, a global flag is set that acts as a lock to prevent further receives occurring until the lock is removed. If a receive does occur while the lock is set, but it is for somebody else, it is passed on to the next node in the loop. If the receive is for ourselves, we fail and NAK the packet so that it will be re-sent at a time when we can hopefully act on it.
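The decision described above can be sketched roughly as follows. This is only an illustration: the names `rx_action_t`, `classify_packet`, `packet_complete` and `release_lock_sketch` are hypothetical and are not taken from the firmware, and a real ISR would work byte by byte rather than on a whole packet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome names, invented for this sketch. */
typedef enum {
    RX_FORWARD,  /* not addressed to us: pass on to the next node */
    RX_ACCEPT,   /* addressed to us and the buffer is free: receive it */
    RX_NAK_BUSY  /* addressed to us but the lock is held: NAK so it is re-sent */
} rx_action_t;

/* Set when a receive completes, cleared when the main loop is done. */
static bool processing_lock = false;

/* Decide what to do with an incoming packet from its destination address. */
rx_action_t classify_packet(uint8_t dest, uint8_t my_address)
{
    if (dest != my_address)
        return RX_FORWARD;   /* forwarding never needs the receive buffer */
    if (processing_lock)
        return RX_NAK_BUSY;  /* previous payload has not been processed yet */
    return RX_ACCEPT;
}

/* Called once the last byte of an accepted packet has arrived. */
void packet_complete(void)
{
    processing_lock = true;
}

/* Called by the main loop when it has finished with the payload. */
void release_lock_sketch(void)
{
    processing_lock = false;
}
```

Note that packets for other nodes are still forwarded while the lock is held; only packets addressed to this node are refused.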

The lock flag also indicates to the main loop that data is waiting, and the main loop is responsible for calling any processing on the data. Processing is not called directly by the ISR, to prevent re-entrancy problems. The act of receiving a byte will wake the CPU and allow it to check for the presence of the lock. After processing is complete it may sleep if it wishes. It will be woken after every byte is received but can just repeatedly sleep again if it likes.

The main loop that is acting on the lock flag must process the command and send any necessary data. It must also wait for an ACK or NAK before finally removing the lock and allowing further receives.

When sending any packet (including ACK or NAK packets), a timeout is started. If the timeout expires, the response is considered to be a NAK and the ACK/NAK is resent. The timeout should be generous enough to allow for full ring propagation with worst-case delays. If the packet comes back to the sender, this is also treated as a NAK. An error counter should limit the number of re-sends before dropping the packet and returning an error.
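One way to picture the resend rules is as a small state machine. The sketch below is an assumption-laden illustration, not the firmware's code: the event and type names are invented, and `MAX_RESENDS` is an arbitrary limit standing in for the error counter mentioned above.

```c
#include <stdint.h>

#define MAX_RESENDS 3  /* assumed limit; the source does not give a value */

/* Things that can happen to a packet we have sent (hypothetical names). */
typedef enum { EV_ACK, EV_NAK, EV_TIMEOUT, EV_RETURNED } tx_event_t;

/* Tristate delivery result: still sending, success, failure. */
typedef enum { TX_SENDING, TX_SUCCESS, TX_FAILED } tx_status_t;

typedef struct {
    tx_status_t status;
    uint8_t resends;  /* how many times the packet has been re-sent */
} tx_state_t;

/* Begin tracking a freshly sent packet. */
void tx_start(tx_state_t *tx)
{
    tx->status = TX_SENDING;
    tx->resends = 0;
}

/*
 * Handle one event. Returns 1 if the caller should re-send the buffered
 * packet and restart the timeout, 0 otherwise.
 */
int tx_handle_event(tx_state_t *tx, tx_event_t ev)
{
    if (tx->status != TX_SENDING)
        return 0;  /* delivery already resolved */
    if (ev == EV_ACK) {
        tx->status = TX_SUCCESS;
        return 0;
    }
    /* A NAK, a timeout, and our own packet coming back all count as a NAK. */
    if (tx->resends >= MAX_RESENDS) {
        tx->status = TX_FAILED;  /* give up and report an error */
        return 0;
    }
    tx->resends++;
    return 1;
}
```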

When data is received, only the payload is available to the main loop. A copy of the source address is also saved until the lock is released. This allows replies to be sent regardless of other packets received or forwarded during processing.

For the moment, data packets will not have ACK/NAK piggybacked with them and each will be sent separately. This is because ACKs are automatically and immediately sent by the ISR routines before a response is even computed.

@todo NAKing a packet while busy should ideally send a special NAK that indicates busy as opposed to failed CRC, etc. This would allow a small pause before re-sending, rather than resending immediately and probably causing the same problem again.

General API:

- Main loop inspects processingLock flag. If set, it actions the data
 in buffer.
- A reply is optionally constructed by calling sendReply. This uses
 the saved source address to send appropriate header bytes.  Nothing
 much happens here because the header can't be constructed until
 the packet is complete (length is unknown).
- Packet payload is sent by repeatedly calling sendDataByte.
- The sending is completed by calling endMessage, which will
 send the actual packet by constructing a header, length, body and
 CRC for the message.
- awaitDelivery is called to wait for a response. A duplicate of
 the entire packet is kept in an additional buffer so that if a NAK
 arrives the same data can be re-sent without bothering the client.
 This method should do very little as the handling of this is
 interrupt driven.  If called, it will block until the delivery is
 complete and return fail/success.
- deliveryStatus returns the same information as awaitDelivery
 (except as tristate values indicating still sending, success or
 failure).  It does not block, however.
- When sending a new message rather than a reply, the sendMessage
 function is called with the destination address.
- Call releaseLock to indicate processing is complete and allow
 any necessary cleanups.  If no ACK has been received yet, this
 will block until it arrives.  If endMessage is not called,
 the packet is dropped.
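Put together, one pass of the main loop might look something like the sketch below. The function names come from the list above, but their signatures are guesses, and the bodies here are trivial stand-ins so that the example is self-contained; the real routines talk to the serial hardware and the ISR. `main_loop_step`, `out`, `outLen`, `sent` and the 0x01 status byte are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the firmware's state; layout and sizes are assumed. */
static bool processingLock = false;  /* set by the ISR when a payload is ready */
static uint8_t buffer[16];           /* received payload */
static uint8_t out[16];              /* reply payload being assembled */
static uint8_t outLen = 0;
static bool sent = false;

/* Stub: start a reply to the saved source address. */
void sendReply(void) { outLen = 0; sent = false; }

/* Stub: append one payload byte to the outgoing packet. */
void sendDataByte(uint8_t b) { if (outLen < sizeof out) out[outLen++] = b; }

/* Stub: the real routine builds header, length, body and CRC and sends them. */
void endMessage(void) { sent = true; }

/* Stub: the real routine blocks until the ACK/NAK handling has finished. */
bool awaitDelivery(void) { return sent; }

/* Stub: allow the ISR to accept the next packet addressed to us. */
void releaseLock(void) { processingLock = false; }

/* One pass of the main loop; returns true if a reply was delivered. */
bool main_loop_step(void)
{
    if (!processingLock)
        return false;         /* nothing waiting; the CPU could sleep here */

    sendReply();
    sendDataByte(buffer[0]);  /* echo the command byte back */
    sendDataByte(0x01);       /* hypothetical status byte */
    endMessage();

    bool ok = awaitDelivery();
    releaseLock();
    return ok;
}
```

A caller that must not block could poll deliveryStatus instead of calling awaitDelivery, checking for the still-sending value on each pass.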

In order for the routines to work, the ISR must call the interrupt handler serialInterruptHandler().

When we get a packet not destined for us or with headers we don't understand, we just pass them on. In theory, a corrupt packet could therefore just be passed on by everybody, forever. To get around this we could buffer the packet and check the CRC, then only send it on if all is well. However in doing so we greatly increase the latency. To prevent the possible long-term buildup of rogue packets it is assumed that there is a node in the ring (such as a more powerful PC) that will check things more thoroughly and mop up any problem packets that are cycling the network. By only having one such node in the network, the latency effects are minimised.
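For the more thorough node, the buffer-then-check approach could look roughly like this. The text above does not say which CRC is in use, so this sketch assumes an 8-bit CRC with the Dallas/Maxim polynomial (x^8 + x^5 + x^4 + 1, initial value 0); which bytes the CRC covers is also an assumption, and `crc8_update`, `crc8` and `should_forward` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold one byte into a running CRC-8.
 * Assumed variant: Dallas/Maxim, bit-reflected polynomial 0x8C.
 */
uint8_t crc8_update(uint8_t crc, uint8_t data)
{
    crc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        if (crc & 1)
            crc = (uint8_t)((crc >> 1) ^ 0x8C);
        else
            crc >>= 1;
    }
    return crc;
}

/* CRC-8 over a whole buffer, starting from zero. */
uint8_t crc8(const uint8_t *bytes, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++)
        crc = crc8_update(crc, bytes[i]);
    return crc;
}

/*
 * Forward a fully buffered packet only if its CRC matches; a mismatch
 * means the packet is corrupt and should be mopped up, not passed on.
 */
bool should_forward(const uint8_t *covered, size_t len, uint8_t received_crc)
{
    return crc8(covered, len) == received_crc;
}
```

The cost is the one named above: the node cannot start forwarding until the whole packet has arrived, which is why only one node in the ring would do this.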

@todo An enhancement that may be needed is something to deal with too much data arriving. For example, if a node is fully occupied with incoming data and a local transmit is occasionally needed, eventually transmitting will block while waiting for the TSR to become free (it won't be able to contain all the outgoing data). This will mean received data is lost and the packet will become corrupted. Worse, at least the next packet will also become corrupted. This situation should be detected, and anything that arrives during a blocking transmit should be cleanly dropped up until the end of the packet. This improvement just decreases the number of lost packets, but is a little complex, so it may or may not be worth doing. Also, in most cases, for a local transmit to be needed there would also have been a command received, which would be consumed, leaving more buffer space. Also, responses from slave devices are not expected to overwhelm the network so badly.


Problems with SNAP:

Error correction is optional. That means the flag itself could be corrupted and no error correction will take place. It should be mandatory and cover the header.

The destination address should occur sooner so packets can be passed on in the network as soon as possible to decrease latency (only relevant in a token ring situation).

The lengths are not continuous up to the sizes we want.

A lot of the other stuff is superfluous.

An ARP-style address resolution protocol, like the one SMBus has, might be nice.


-- Main.AdrianBowyer - 17 Feb 2007