python - Split array into evenly-distributed chunks
I'm looking for an efficient way to split an array into chunks that have a similar histogram.
For example, given the array:
l = np.array([1, 2, 3, 4, 1, 1, 1, 2, 3, 4, 55, 5, 5, 5, 5, 5, 3, 2, 2, 21, 2])
I would like to obtain:
c1 = np.array([1, 4, 1, 5, 5, 3, 2])
c2 = np.array([2, 1, 3, 4, 5, 5, 2])
c3 = np.array([3, 1, 2, 55, 5, 2, 21])
Note: each chunk should have a similar size and a similar sum under a given function f:
1. |sum(ci, f) - sum(cj, f)| < e, for all i != j
2. |len(ci) - len(cj)| < e, for all i != j
where
sum(c, f) = f(c[0]) + ... + f(c[len(c) - 1])
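The cost definition above can be written out directly; `chunk_cost` is a hypothetical helper name, shown here with the identity function as f and the example chunks from the question:

```python
import numpy as np

def chunk_cost(c, f):
    """sum(c, f) = f(c[0]) + ... + f(c[len(c) - 1])."""
    return sum(f(x) for x in c)

c1 = np.array([1, 4, 1, 5, 5, 3, 2])
c2 = np.array([2, 1, 3, 4, 5, 5, 2])

print(chunk_cost(c1, lambda i: i))  # 21
print(chunk_cost(c2, lambda i: i))  # 22
```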
Edit: to clarify the intention of this: I want to parallelize the processing of a list over n sub-processes, and the cost has to be distributed evenly among the sub-processes. The cost of processing an element in the list is a function f of the integer at the same position in the array l, where f models the computational complexity of the process; for example, f(i) = i^2. So, I want all processes to have the same computational cost, rather than having some processes finish quickly while others last forever.
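To see why a naive split is not enough: a plain contiguous split (e.g. NumPy's `array_split`, used here purely for illustration) satisfies the length condition but can badly violate the cost condition when f(i) = i^2:

```python
import numpy as np

l = np.array([1, 2, 3, 4, 1, 1, 1, 2, 3, 4, 55, 5, 5, 5, 5, 5, 3, 2, 2, 21, 2])
f = lambda i: i ** 2  # the example cost function from the question

# Equal-length contiguous chunks, but wildly uneven f-costs:
for chunk in np.array_split(l, 3):
    print(len(chunk), int(f(chunk).sum()))
# 7 33
# 7 3129
# 7 512
```

The chunk that happens to contain the value 55 carries almost the entire cost, so the sub-process handling it would dominate the runtime.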
Let's start with the weak assumption that similar histograms are defined in the following rudimentary way: the histogram h(s1) of a set of integers s1 is similar to the histogram h(s2) of a set s2 if sum(s1) = sum(s2).
More generally, you are after finding subsets s1, s2, ..., sn of an array a such that f(s1) = f(s2) = ... = f(sn), and under our assumptions f = sum. Unfortunately, you've got the k-partition problem, which is NP-hard, and if you find an efficient (i.e. poly-time) way to do that, as requested, the consequence would be P = NP,
proven true first on Stack Overflow!
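Since an exact solution is out of reach, a practical fallback (not part of the original answer) is the greedy longest-processing-time heuristic: assign elements in order of decreasing f-cost, always to the chunk that is currently lightest. A sketch, with `greedy_split` as a hypothetical helper name:

```python
import heapq
import numpy as np

def greedy_split(l, k, f):
    """LPT heuristic: place elements, heaviest f-cost first,
    into the chunk with the smallest total cost so far."""
    order = sorted(range(len(l)), key=lambda i: -f(l[i]))
    # heap entries: (total cost of chunk, chunk id, chunk members)
    heap = [(0, j, []) for j in range(k)]
    heapq.heapify(heap)
    for i in order:
        total, j, members = heapq.heappop(heap)
        members.append(l[i])
        heapq.heappush(heap, (total + f(l[i]), j, members))
    return [np.array(members) for _, _, members in heap]

l = np.array([1, 2, 3, 4, 1, 1, 1, 2, 3, 4, 55, 5, 5, 5, 5, 5, 3, 2, 2, 21, 2])
chunks = greedy_split(l, 3, lambda i: i ** 2)
```

Note that this only balances cost, not length: with an outlier like 55 (cost 3025, more than the rest of the array combined), no split can satisfy both conditions at once, so the heuristic isolates the outlier and packs the cheap elements together.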