This is version 14, the latest and best.


Note that my speed for copies where dst and src share the same (non-word) alignment is faster than uclinux's word-aligned speed, once the copy length is long enough.

Overview of copy lengths up to 64 bytes:

Graphs show copy lengths from 0 to 256 bytes. Click any graph to see it zoomed in to show only 0 to 64 byte copy lengths, or click here to see all graphs showing 0-64 bytes.

(Raw data can be found here; raw data sorted into groups by type of alignment is here. Each raw data line is in the form:
dst address % 4, src address % 4, copy length, microseconds (usec) for C memcpy, usec for kernel memcpy, usec for uclinux memcpy, usec for my memcopy (version 14), the string "CKUE".)
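For anyone who wants to work with the raw data directly, here is a minimal sketch of reading one line in the form described above. The struct, its field names, and the function name parse_sample are my own inventions for illustration. The field separator is an assumption too: the sketch accepts either whitespace or commas, since the description above does not say which one the files use.

```c
#include <stdio.h>
#include <string.h>

/* One benchmark sample. Field names are illustrative only;
 * they do not come from the raw data files. */
struct sample {
    unsigned dst_mod4;     /* dst address % 4 */
    unsigned src_mod4;     /* src address % 4 */
    unsigned length;       /* copy length in bytes */
    double   usec_c;       /* usec for C memcpy */
    double   usec_kernel;  /* usec for kernel memcpy */
    double   usec_uclinux; /* usec for uclinux memcpy */
    double   usec_mine;    /* usec for version 14 */
};

/* Parse one raw data line into *out.
 * ASSUMPTION: fields are separated by whitespace and/or commas.
 * Returns 1 if all eight fields were read and the line ends with
 * the tag "CKUE", otherwise 0. */
int parse_sample(const char *line, struct sample *out)
{
    char buf[256];
    char tag[8];
    size_t i;
    int n;

    /* Copy the line, turning commas into spaces so that one
     * sscanf format handles both separator styles. */
    for (i = 0; line[i] != '\0' && i < sizeof buf - 1; i++)
        buf[i] = (line[i] == ',') ? ' ' : line[i];
    buf[i] = '\0';

    n = sscanf(buf, "%u %u %u %lf %lf %lf %lf %7s",
               &out->dst_mod4, &out->src_mod4, &out->length,
               &out->usec_c, &out->usec_kernel,
               &out->usec_uclinux, &out->usec_mine, tag);

    return n == 8 && strcmp(tag, "CKUE") == 0;
}
```

Checking the trailing "CKUE" string is a cheap way to reject truncated or corrupted lines before using the timings.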

Word aligned copies (dst and src both are word aligned):

Same-alignment copies (dst and src are both not word aligned, but have the same misalignment):

Mis-aligned copies (dst and src have different alignments), in three types:

With dst word aligned:

With src word aligned:

With neither src nor dst word aligned:
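
The five groupings used for the graphs can be derived from just the two "address % 4" values in each raw data line. A small sketch of that classification follows. The enum and function names are hypothetical, made up for this illustration, and a word is taken to be 4 bytes, as the "% 4" in the raw data format implies.

```c
#include <stdint.h>

/* The five alignment groups the graphs are split into.
 * Names are illustrative only. */
enum align_class {
    ALIGN_WORD,      /* dst and src both word aligned */
    ALIGN_SAME,      /* both misaligned, by the same amount */
    ALIGN_DST_ONLY,  /* different alignments, dst word aligned */
    ALIGN_SRC_ONLY,  /* different alignments, src word aligned */
    ALIGN_NEITHER    /* different alignments, neither word aligned */
};

/* Classify a copy from its dst and src offsets within a 4-byte word. */
enum align_class classify_alignment(unsigned dst_mod4, unsigned src_mod4)
{
    if (dst_mod4 == src_mod4)
        return dst_mod4 == 0 ? ALIGN_WORD : ALIGN_SAME;
    if (dst_mod4 == 0)
        return ALIGN_DST_ONLY;
    if (src_mod4 == 0)
        return ALIGN_SRC_ONLY;
    return ALIGN_NEITHER;
}

/* Same classification, starting from the actual pointers. */
enum align_class classify_pointers(const void *dst, const void *src)
{
    return classify_alignment((unsigned)((uintptr_t)dst & 3u),
                              (unsigned)((uintptr_t)src & 3u));
}
```

For example, dst % 4 == 2 and src % 4 == 2 falls in the same-misalignment group, while dst % 4 == 1 and src % 4 == 3 falls in the last group, where neither is word aligned.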