This is version 14, the latest and best.

Overview:


Note that my speed for same-aligned (but not word-aligned) copies is faster than uclinux's speed for word-aligned copies, once the copy length is long enough.

Overview of copy lengths up to 64 bytes:


Graphs show copy lengths from 0 to 256 bytes. Click any graph to see it zoomed in to show only 0 to 64 byte copy lengths, or click here to see all graphs showing 0-64 bytes.

(Raw data can be found here; raw data sorted into groups by type of alignment, here. Each raw data line is in the form:
dst address % 4, src address % 4, copy length, microseconds (usec) for C memcpy, usec for kernel memcpy, usec for uclinux memcpy, usec for my memcopy (version 14), the string "CKUE")
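As a rough illustration of the line format described above, here is a minimal sketch of reading one raw data line and sorting it into the alignment groups used for the graphs. Everything named here (struct memcpy_sample, parse_sample, classify_alignment, the enum values) is hypothetical and not part of the benchmark itself. It also assumes the fields are separated by spaces and/or commas and that the timings may be fractional; adjust if the raw files differ.

```c
#include <stdio.h>
#include <string.h>

/* One raw data line, fields in the order listed above (names are illustrative). */
struct memcpy_sample {
    int dst_mod4;         /* dst address % 4 */
    int src_mod4;         /* src address % 4 */
    int length;           /* copy length in bytes */
    double usec_c;        /* C memcpy */
    double usec_kernel;   /* kernel memcpy */
    double usec_uclinux;  /* uclinux memcpy */
    double usec_mine;     /* my memcopy (version 14) */
};

/* The five groups the graphs are split into. */
enum align_group {
    BOTH_WORD_ALIGNED,    /* dst % 4 == 0 and src % 4 == 0 */
    SAME_MISALIGNMENT,    /* dst % 4 == src % 4, but not 0 */
    DST_ALIGNED_ONLY,     /* only dst is word aligned */
    SRC_ALIGNED_ONLY,     /* only src is word aligned */
    NEITHER_ALIGNED       /* different alignments, neither is 0 */
};

/* Parse one line; returns 1 on success, 0 if the line does not match.
 * Assumption: fields are separated by whitespace and/or commas. */
static int parse_sample(const char *line, struct memcpy_sample *out)
{
    char buf[256];
    char tag[8];
    size_t i, n = strlen(line);

    if (n >= sizeof buf)
        return 0;
    /* Copy the line (including the terminator), turning commas into spaces. */
    for (i = 0; i <= n; i++)
        buf[i] = (line[i] == ',') ? ' ' : line[i];

    if (sscanf(buf, "%d %d %d %lf %lf %lf %lf %7s",
               &out->dst_mod4, &out->src_mod4, &out->length,
               &out->usec_c, &out->usec_kernel,
               &out->usec_uclinux, &out->usec_mine, tag) != 8)
        return 0;

    /* Every line ends with the string "CKUE"; alignments must be 0..3. */
    if (strcmp(tag, "CKUE") != 0)
        return 0;
    if (out->dst_mod4 < 0 || out->dst_mod4 > 3 ||
        out->src_mod4 < 0 || out->src_mod4 > 3 || out->length < 0)
        return 0;
    return 1;
}

/* Decide which group of graphs a (dst % 4, src % 4) pair belongs to. */
static enum align_group classify_alignment(int dst_mod4, int src_mod4)
{
    if (dst_mod4 == 0 && src_mod4 == 0)
        return BOTH_WORD_ALIGNED;
    if (dst_mod4 == src_mod4)
        return SAME_MISALIGNMENT;
    if (dst_mod4 == 0)
        return DST_ALIGNED_ONLY;
    if (src_mod4 == 0)
        return SRC_ALIGNED_ONLY;
    return NEITHER_ALIGNED;
}
```

A caller would read the raw data file line by line (for example with fgets), pass each line to parse_sample, and use classify_alignment on the first two fields to decide which group the timing belongs to.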

Word aligned copies (dst and src both are word aligned):


Same-aligned copies (dst and src are both not word aligned, but have the same (mis)alignment):






Mis-aligned copies (dst and src have different alignments), in three types:

With dst word aligned:






With src word aligned:






With neither src nor dst word aligned: