Differences

This shows you the differences between two versions of the page.

destination_passing_style [2021/12/08 10:24] awf
destination_passing_style [2022/09/26 08:34] (current) awf
Line 47:
 </code>
 
-And another version ''vadd_dps'' which takes a pre-sized result buffer
+And another version ''vadd_dps'' which takes a pre-sized result buffer, which the caller must ensure is the correct size
 
 <code c++>
 Vector vadd_dps(void* buf, Vector a, Vector b)
 {
-    double* out = (double*)buf; // Function body exactly as before, but no alloc in vadd.
+    double* out = (double*)buf;
     vadd_blas(min(a.size,b.size), a.data, b.data, out);
     std::fill(out + min(a.size, b.size), out + max(a.size, b.size), 0.0);
Line 120:
 
 <code c++>
-void vadd_dps(Vector* out, Vector a, Vector b)
+void vadd_dps(void* buf, Vector a, Vector b)
 {
    Vector tmp = vadd(a,b);
-   std::copy(tmp.data, tmp.data+tmp.size, out->data);
+   std::copy(tmp.data, tmp.data+tmp.size, buf);
    gcdelete tmp.data;
+   return Vector(tmp.size, buf);
 }
 </code>
 
-Again, the copy and allocations can be removed by DCE.  So, what's the point?   We seem to have just made ''vadd'' less efficient, and any gains we talk about could have been made by inlining ''vadd'' in main.
+which again, can be easily optimized.  So, what's the point?   We seem to have just made ''vadd'' less efficient, and any gains we talk about could have been made by inlining ''vadd'' in main.
 
 The point is this: when compiling ''vadd'', we can easily compile and optimize ''vadd_size'' and ''vadd_dps'', without cross-module inlining.   When a caller sees ''vadd'', it can observe the existence of the ''_size'' and ''_dps'' variants, and know that stack-discipline allocation will yield more efficient code.  This will mean less stuff on the GC heap, and less GC overhead.  With careful subsetting of the source language (and with the introduction of an explicit ''gcnew'' or ''new/delete'' in the source), we can compile many sensible programs from a functional language such as F# to non-garbage-collected C.
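
As a rough illustration of the caller side described in the paragraph above, the sketch below shows how a caller might pair a ''_size'' query with a ''_dps'' call so that the temporary result lives in scoped, caller-owned storage. Everything here is an assumption made for illustration: the ''Vector'' struct is a hypothetical stand-in, ''vadd_size'' is assumed to return a byte count, a plain loop replaces ''vadd_blas'', ''sum_of_vadd'' is an invented caller, and a ''std::vector'' stands in for a true stack allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the page's Vector: a length plus a non-owning pointer.
struct Vector {
    std::size_t size;
    double* data;
};

// Assumed meaning of the _size variant: bytes needed for the result of vadd(a, b).
std::size_t vadd_size(Vector a, Vector b) {
    return std::max(a.size, b.size) * sizeof(double);
}

// DPS variant: writes the result into buf, which the caller must have sized
// with vadd_size(a, b). No allocation happens in here.
Vector vadd_dps(void* buf, Vector a, Vector b) {
    const std::size_t lo = std::min(a.size, b.size);
    const std::size_t hi = std::max(a.size, b.size);
    assert(buf != nullptr || hi == 0);
    double* out = static_cast<double*>(buf);
    for (std::size_t i = 0; i < lo; ++i) {
        out[i] = a.data[i] + b.data[i];  // plain loop standing in for vadd_blas
    }
    std::fill(out + lo, out + hi, 0.0);  // zero the tail, as in the page's version
    return Vector{hi, out};
}

// Hypothetical caller: the temporary sum lives in scoped scratch storage that is
// released when the function returns, so nothing is left for a collector.
double sum_of_vadd(Vector a, Vector b) {
    std::vector<double> scratch(vadd_size(a, b) / sizeof(double));
    Vector r = vadd_dps(scratch.data(), a, b);
    double total = 0.0;
    for (std::size_t i = 0; i < r.size; ++i) {
        total += r.data[i];
    }
    return total;
}
```

The caller never needs to see the body of ''vadd_dps''; it only relies on the two variants existing, which is the property the paragraph attributes to compiling them alongside ''vadd''.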
destination_passing_style.txt · Last modified: 2022/09/26 08:34 by awf
CC Attribution 4.0 International