Considers the input b as packed 64-bit integers and c as packed 8-bit integers.
Then groups eight 8-bit values from c as indices into the bits of the corresponding 64-bit integer.
It then selects these bits and packs them into the output.
Uses the writemask in k: a result bit is zeroed if the corresponding mask bit is not set; otherwise the computed bit is written to the result.
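
The behavior described above can be sketched as a scalar reference model. This is an illustration only, not the intrinsic's implementation: the function name `bitshuffle_epi64_mask_model` is hypothetical, and a 512-bit width (eight 64-bit lanes, 64 control bytes, 64-bit mask) is assumed because the text does not state the vector width. The sketch also assumes that only the low 6 bits of each control byte are used as the bit index, which matches the `VPSHUFBITQMB` instruction as far as I know.

```rust
/// Scalar model of a masked bit shuffle (hypothetical helper, 512-bit width assumed).
///
/// `b` holds eight 64-bit lanes, `c` holds 64 control bytes (eight per lane),
/// and `k` is the writemask applied to the 64 result bits.
pub fn bitshuffle_epi64_mask_model(k: u64, b: [u64; 8], c: [u8; 64]) -> u64 {
    let mut result: u64 = 0;
    for lane in 0..8 {
        for j in 0..8 {
            // Position of this control byte, and of the result bit it produces.
            let pos = lane * 8 + j;
            // Assumption: only the low 6 bits select a bit index in 0..=63.
            let idx = (c[pos] & 0x3f) as u32;
            // Pick the indexed bit out of the 64-bit integer in the same lane.
            let bit = (b[lane] >> idx) & 1;
            result |= bit << pos;
        }
    }
    // Zero every result bit whose corresponding mask bit in k is not set.
    result & k
}
```

With lane 0 of `b` set to `0b1010` and the first four control bytes set to 1, 0, 3, 2, the model picks bits 1, 0, 3, 2 of that lane in order, so result bits 0 and 2 are set. Passing `k = 0b0001` then keeps only result bit 0.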