Matthew van Eerde's web log

  • Matthew van Eerde's web log

    Walking the IDeviceTopology tree to see audio driver settings

    • 2 Comments

    I've blogged before about using the IDeviceTopology API to poke around the internal structure exposed by audio drivers.

    In particular, given an audio endpoint, you can map out all the knobs and widgets of all the signal paths that feed into that endpoint (for playback) or out of it (for recording.)

    A version of this has been floating around the forums for some years, but I thought I would spiffy it up a little and make a blog post out of it.

    Pseudocode:

    for each endpoint exposed by IMMDeviceEnumerator:
        figure out whether it's a playback endpoint or a recording endpoint
        find the IPart which is
            the very end of the signal path for a playback endpoint, or
            the very beginning for a recording endpoint:
        if the IPart advertises jack information, query it
        if the IPart is a volume control, query the range and the current setting
        if the IPart is a mute control, query the current setting
        (we could continue here for various other kinds of controls, but I'm lazy)
        for all parts which
            feed in to the IPart for a playback endpoint, or
            are fed by the IPart for a recording endpoint:
            recursively process that IPart
        if the IPart is a connector:
            find the IPart on the other side of the connection
            process it recursively
            take care not to immediately hop back to this graph!

    Output:

    >devicetopology.exe
    eRender endpoint
    Name: Speakers (High Definition Audio Device)
    Endpoint ID: {0.0.0.00000000}.{351933db-e9b5-41df-8fa5-69c3d84531df}
      0x10000:
      {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\elineouttopo
        0x10001: Speakers
        Jacks: 1
          -- Jack 1 --
            ChannelMapping: 0x3
            Color: 0x00ff00 (red = 0, green = 255, blue = 0)
            Connection Type: 1 (eConnType3Point5mm)
            Geometric Location: 1 (eGeoLocRear)
            General Location: 0 (eGenLocPrimaryBox)
            Port Connection: 0 (ePortConnJack)
            IsConnected: No
          0x20001: Master Mute
            0x20000: Speakers
              0x10000:
              {0.0.1.00000000}.{6dd94b4a-90d7-4ddc-926d-69312eb53841}
                0x10000:

    eRender endpoint
    Name: Speakers (High Definition Audio Device)
    Endpoint ID: {0.0.0.00000000}.{6ce8bdf4-d22f-4ec7-a007-2228540b6705}
      0x10000:
      {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\elineout2topo
        0x10001: Speakers
        Jacks: 1
          -- Jack 1 --
            ChannelMapping: 0x3
            Color: 0x000000 (red = 0, green = 0, blue = 0)
            Connection Type: 3 (eConnTypeAtapiInternal)
            Geometric Location: 13 (eGeoLocATAPI)
            General Location: 0 (eGenLocPrimaryBox)
            Port Connection: 1 (ePortConnIntegratedDevice)
            IsConnected: Yes
          0x20001: Master Mute
          Mute node: MUTED
            0x20000: Speakers
            Channel 0 volume, -46.5 dB to 0 dB in steps of 1.5 dB: -6 dB
              0x10000:
              {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\elineout2wave
                0x10001:
                  0x20000: HD Audio line out
                    0x10000: PC Speaker

    eRender endpoint
    Name: Speakerphone (USB Audio Device)
    Endpoint ID: {0.0.0.00000000}.{916afdff-2ba3-4a7f-bca9-0bf75c47c946}
      0x10000:
      {2}.\\?\usb#vid_095d&pid_0005&mi_00#7&335bfb1&0&0000#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\global
        0x10003: Speakerphone
          0x20009: DAC
            0x20008: Volume
            Channel 0 volume, -30.0039 dB to 0 dB in steps of 0.5 dB: -17.747 dB
              0x20007: Mute
              Mute node: NOT MUTED
                0x20006: Sample Rate Converter
                  0x10000:

    eCapture endpoint
    Name: Microphone (High Definition Audio Device)
    Endpoint ID: {0.0.1.00000000}.{38178e43-ed21-4731-b8f0-1330c3b1cd0a}
      0x10000:
      {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\emixedcapturetopo
        0x10001: Microphone
        Jacks: 1
          -- Jack 1 --
            ChannelMapping: 0x0
            Color: 0xff80c0 (red = 255, green = 128, blue = 192)
            Connection Type: 1 (eConnType3Point5mm)
            Geometric Location: 1 (eGeoLocRear)
            General Location: 0 (eGenLocPrimaryBox)
            Port Connection: 0 (ePortConnJack)
            IsConnected: No
          0x20001: Microphone
            0x20002: Microphone
              0x20003: Microphone Boost
                0x20000: Sum
                  0x10000:
                  {0.0.0.00000000}.{f7bf26d7-89a3-4049-8cd3-8148a6615565}
                    0x10000:

    eCapture endpoint
    Name: Speakerphone (USB Audio Device)
    Endpoint ID: {0.0.1.00000000}.{a21f51bc-5d0b-47af-93b6-ef9a968bb8ff}
      0x10000:
      {2}.\\?\usb#vid_095d&pid_0005&mi_00#7&335bfb1&0&0000#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\global
        0x10002: Speakerphone
          0x20000: ADC
            0x20001: Mute
            Mute node: NOT MUTED
              0x20002: Volume
              Channel 0 volume, 0 dB to 32 dB in steps of 0.5 dB: 29 dB
                0x20003: SuperMix
                  0x20004:
                    0x20005: Sample Rate Converter
                      0x10001:

    eCapture endpoint
    Name: Microphone (High Definition Audio Device)
    Endpoint ID: {0.0.1.00000000}.{a9d04431-a99f-4ebe-b995-33f0d027559c}
      0x10000:
      {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\emixedcapturetopo
        0x10002: Microphone
        Jacks: 1
          -- Jack 1 --
            ChannelMapping: 0x0
            Color: 0x000000 (red = 0, green = 0, blue = 0)
            Connection Type: 1 (eConnType3Point5mm)
            Geometric Location: 2 (eGeoLocFront)
            General Location: 0 (eGenLocPrimaryBox)
            Port Connection: 0 (ePortConnJack)
            IsConnected: No
          0x20004: Microphone
            0x20005: Microphone
              0x20006: Microphone Boost
                0x20000: Sum
                  0x10000:
                  {0.0.0.00000000}.{f7bf26d7-89a3-4049-8cd3-8148a6615565}
                    0x10000:

    eCapture endpoint
    Name: Line In (High Definition Audio Device)
    Endpoint ID: {0.0.1.00000000}.{bbf7eb42-7674-4a35-ad02-b723ea047dac}
      0x10000:
      {2}.\\?\hdaudio#func_01&ven_10ec&dev_0262&subsys_103c1309&rev_1002#4&12f2dd06&0&0001#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\emixedcapturetopo
        0x10003: Line In
        Jacks: 1
          -- Jack 1 --
            ChannelMapping: 0x0
            Color: 0x0000ff (red = 0, green = 0, blue = 255)
            Connection Type: 1 (eConnType3Point5mm)
            Geometric Location: 1 (eGeoLocRear)
            General Location: 0 (eGenLocPrimaryBox)
            Port Connection: 0 (ePortConnJack)
            IsConnected: No
          0x20007: Line In
            0x20008: Line In
              0x20000: Sum
                0x10000:
                {0.0.0.00000000}.{f7bf26d7-89a3-4049-8cd3-8148a6615565}
                  0x10000:

    And here's a screenshot of the Windows Driver Kit's Kernel Streaming Studio showing the topology exposed by the audio drivers on my machine:

     Source and x86/amd64 binaries attached.

  • Matthew van Eerde's web log

    Draining the WASAPI capture buffer fully

    • 0 Comments

    About six years ago I wrote a blog post about how to do WASAPI loopback capture. Since then, a few issues have come to light.

    One big issue is that we're using MMCSS registration directly. Nowadays the much-preferred approach is to use a Media Foundation work queue; yes, you can use a Media Foundation work queue even if you're doing WASAPI directly. The official WASAPI sample demonstrates how to use this approach.

    Another big issue is that the code isn't draining the buffer fully; it assumes there is only a single packet per engine period. Here's the capture loop as it stands:

    for (...) {
        Wait();
        if (data available) { ReadAPacket(); }
    }

    There are two reasons this is a problem. First, if you miss a period, you will never catch up. Second, the engine is free to write many small packets into the buffer if it likes.

    The code SHOULD look like this:

    for (...) {
        while (data available) { ReadAPacket(); }
        Wait();
    }

    Updated source and binaries attached.

  • Matthew van Eerde's web log

    More on audio buffer alignment requirements

    • 0 Comments

    I chatted in the past about how audio device alignment requirements impact the buffer size and the WASAPI alignment dance.

    There are three alignment requirements on audio buffers:

    1. The buffer size must be a multiple of WAVEFORMATEX.nBlockAlign. This allows individual audio frames to be copied around without worrying about them being cut in half and then having to glue them together at the end.
    2. KSPROPERTY_RTAUDIO_BUFFER must be a multiple of a page - that is, 4096 bytes. This allows multiply mapping the buffer into consecutive pages, which in turn simplifies memory copies where the buffer is the source or the destination. KSPROPERTY_RTAUDIO_BUFFER is for timer-driven streaming; there is an event-driven analog, KSPROPERTY_RTAUDIO_BUFFER_WITH_NOTIFICATION, which has no corresponding alignment requirement.
    3. HD Audio buffer allocations must be a multiple of 256 bytes. For timer-driven buffers, this applies to the whole buffer. For event-driven buffers, this applies to the sum of the "ping" and "pong" buffers, so the individual "ping" or "pong" buffer must be a multiple of 128 bytes.

    Consider a 5.1 16-bit 48 kHz stream playing to HD Audio hardware via KSPROPERTY_RTAUDIO_BUFFER. Where multiple alignment requirements apply, the effective alignment requirement is the least common multiple of all the applicable requirements.

    From the nBlockAlign requirement, the buffer must be a multiple of (6 * 16) / 8 bytes = 12 bytes.

    From the KSPROPERTY_RTAUDIO_BUFFER requirement, the buffer must be a multiple of PAGE_SIZE = 4096 bytes.

    From the HD Audio requirement, the buffer must be a multiple of 256 bytes (this is timer-driven, so we do not divide by 2.)

    In all, then, the buffer must be a multiple of LCM(12, 4096, 256) = 12288 bytes.

    Since WAVEFORMATEX.nAvgBytesPerSec = ((6 * 16) / 8) * 48000 = 576000 byte/sec, this corresponds to 12288 / 576000 * 1000 = 21.333 milliseconds.

  • Matthew van Eerde's web log

    Using the Speech API to convert speech to text

    • 0 Comments

    Some time ago I created a "listen.exe" tool which used SAPI's ISpRecoContext to listen to the microphone and dump any recognized text to the console.

    Today I had to debug an issue with SAPI reading from a .wav file, so I updated it to accept a listen.exe --file foo.wav argument; this consumes the audio in the .wav file instead of listening to the microphone.

    Pseudocode for the difference:

    CoCreate(ISpRecognizer);
    CoCreate(ISpStream);
    pSpStream->BindToFile(file);
    pSpRecognizer->SetInput(pSpStream);

    Also, we have to tell the ISpRecoContext that we're interested in SPEI_END_SR_STREAM events as well as SPEI_RECOGNITION events.

    Full source and binaries attached.

    A gotcha: the .wav file has to have a WAVEFORMATEX.wFormatTag = WAVE_FORMAT_PCM. If it's anything else, ISpRecoGrammar::SetDictationState fails with SPERR_UNSUPPORTED_FORMAT. Neither WAVE_FORMAT_IEEE_FLOAT nor (WAVE_FORMAT_EXTENSIBLE with SubFormat = KSDATAFORMAT_SUBTYPE_PCM) work.

  • Matthew van Eerde's web log

    World Cup tiebreaks: how the United States can survive Group G

    • 2 Comments

    With World Cup Group G standings being the way they are, I decided to read up on tiebreaks.

    Here are the FIFA rules (PDF)

    The relevant sections are:

    18.4.a: ... with three points for a win, one point for a draw and no points for a defeat (league format)

    ...

    18.6: In the league format, the ranking in each group is determined as follows:

    a) greatest number of points obtained in all group matches;
    b) goal difference in all group matches;
    c) greatest number of goals scored in all group matches.

    If two or more teams are equal on the basis of the above three criteria, their rankings shall be determined as follows:

    d) greatest number of points obtained in the group matches between the teams concerned;
    e) goal difference resulting from the group matches between the teams concerned;
    f) greater number of goals scored in all group matches between the teams concerned;
    g) the goals scored away from home count double between the teams concerned (if the tie is only between two teams).

    18.7 discusses play-off matches between teams that are still tied after all of 18.6.a-g are applied. As we'll see, there is no possibility of the United States being involved in a play-off.

    Let's start with 18.6.a. The current point standings in Group G are:

    Germany
    Four points, with a win and a tie
    United States
    Four points, with a win and a tie
    Ghana
    One point, with a tie and a defeat
    Portugal
    One point, with a tie and a defeat

    The remaining games are United States-Germany and Portugal-Ghana.

    If the United States-Germany game results in a tie, then the United States and Germany will have five points apiece, with a win and two ties. They will be the top two teams in the group, regardless of the outcome of the Portugal-Ghana game.

    If the Portugal-Ghana game results in a tie, then Ghana and Portugal will have two points apiece, with two ties and a defeat. Regardless of the results of the United States-Germany game, the United States and Germany will be the top two teams in the group.

    So we need only consider what happens if both games are decisive.

    The winner of the United States-Germany game will have seven points, with two wins and a tie. They are in on points.

    The loser of the Portugal-Ghana game will have one point, with one tie and two defeats. They are out on points.

    The remaining two teams (the loser of the United States-Germany game, and the winner of the Portugal-Ghana game) will have four points apiece, with a win, a tie, and a defeat.

    So in this scenario, 18.6.b will be invoked. At this point we have to look at the goal difference for the teams. (The goal difference for, say, Germany is just the number of goals Germany scored, minus the number of goals Germany allowed.) These are currently:

    Germany
    +4
    United States
    +1
    Ghana
    -1
    Portugal
    -4

    The raw numbers look good for Germany and the United States, and bad for Ghana and Portugal. But let's bear in mind that we are considering the scenario where Germany or the United States has just lost, and Ghana or Portugal has just won. How do you win? By scoring more goals than you allow! So the goal differential of the loser of the United States-Germany game will be lower than it is now, and the goal differential of the winner of the Portugal-Ghana game will be higher than it is now.

    So as a United States fan:

    1. Beat Germany and we're in on points.
    2. If we can't beat Germany, tie Germany and we're in on points.
    3. If Germany beats us, root for a Portugal-Ghana tie and we're in points.
    4. If Germany beats us, and the Portugal-Ghana game is also decisive, root for Portugal to win, and for the margin of victory for both games to add up to no higher than four, and we're in on goal differential.
      More explicitly:
      1. If Germany wins by one goal, our GD will be zero; Portugal can win by as many as three goals and we will be in on goal differential. If Portugal wins by five or more, we're out on goal differential. If Portugal wins by precisely four, see below.
      2. If Germany wins by two goals, our GD will be -1; Portugal can win by as many as two goals and we will be in on goal differential. If Portugal wins by four or more, we're out on goal differential. If Portugal wins by precisely three, see below.
      3. If Germany wins by three goals, our GD will be -2; Portugal can win by one goal and we will be in on goal differential. If Portugal wins by three or more, we're out on goal differential. If Portugal wins by precisely two, see below.
      4. If Germany wins by four goals, our GD will be -3. If Portugal wins by two or more, we're out on goal differential. If Portugal wins by precisely one, see below.
      5. If Germany wins by five goals or more, we're out on goal differential.
    5. If Germany beats us, and Ghana beats Portugal, root for the margin of victory in both games to add up to two.
      1. If Germany wins by one goal, our GD will be zero; if Ghana wins by one goal, their GD will also be zero. See below.
      2. If Germany wins by two goals or more, we're out on goal differential.

    What if Germany beats us, and Portugal beats Ghana, but the margin of victory of both games adds up to exactly four? Or what if Germany beats us and Ghana beats Portugal, and the margin of victory of both games adds up to precisely two - that is, each game is decided by a single goal?

    We can still get in on 18.6.c. At this point we have to look at the number of goals scored by each team. These are currently:

    Germany
    6
    United States
    4
    Ghana
    3
    Portugal
    2

    Note that the United States has one more goal than Ghana, and two more goals than Portugal. This is good. In the "goal differential" case, there was a strong relationship between winning and changing your goal differential. But there is only a weak relationship between winning and changing your raw goal count.

    For example, if Germany beats the United States 3-2, and Ghana beats Portugal 1-0, the United States will actually have pulled even farther ahead in the 18.6.c race!

    So:

    1. If Germany beats us, and Portugal beats Ghana, and the margin of victory of both games adds up to exactly four, root for Portugal's winning score to be no more than one goal higher than our losing score, and we're in on goals scored.
      1. If Portugal's winning score is three or more goals higher than our losing score, we're out on goals scored.
      2. If Portugal's winning score is precisely two goal higher than our losing score, see below.
    2. If Germany beats us, and Ghana beats Portugal, and the margin of victory of both games adds up to exactly two, root for Ghana's winning score to be no more than our losing score, and we're in on goals scored.
      1. If Ghana's winning score is two or more goals than our losing score, we're out on goals scored.
      2. If Ghana's winning score is precisely one goal higher than our losing score, see below.

    What if Germany beats us, and Portugal beats Ghana, and the margin of victory of the two games adds up to exactly four, and Portugal's winning score is precisely two goals more than our losing score? Or what if Germany beats us and Ghana beats Portugal, and the margin of victory of both games adds up to precisely two, and Ghana's winning score is precisely one goal higher than our losing score?

    Then we are in or out on 18.6.d-g. At this point we have to look at the results of the Ghana-United States and United States-Portugal matches. In each match, the first team is the home team and the second team is the away team.

    Ghana-United States
    1-2
    United States-Portugal
    2-2

    So:

    1. If Germany beats us, and Portugal beats Ghana, and the margin of victory of both games adds up to exactly four, and Portugal's winning score is precisely two goals higher than our losing score, we're out:
      1. We're still tied on direct match points (18.6.d)
      2. We're still tied on direct match goal differential (18.6.e)
      3. We're still tied on direct match goals scored (18.6.f)
      4. We're out on direct match goals scored with away goals counting double (18.6.g)
    2. If Germany beats us, and Ghana beats Portugal, and the margin of victory of both games adds up to exactly two, and Ghana's winning score is precisely one more than our losing score, then we're in on direct match points (18.6.d)

    Note there is no possibility of a playoff. Had the United States-Portugal game been a 0-0 tie instead of a 2-2 tie, that would have been a possibility, since double 0 is still 0.

  • Matthew van Eerde's web log

    Expressing a function f: GF(2⁸) → GF(2⁸) as a polynomial using a Lagrange polynomial

    • 0 Comments

    I talked about Rijndael in a couple of previous posts: Generating the Rijndael S-box, Efficient multiplication and division in GF(28), Sieving irreducible monic polynomials over a finite field, Addition and multiplication table for GF(22).

    I'm going to talk some more about it today.

    The Rijndael non-linear S-box S(x) is a composition of two invertible functions f(g(x)). Last time we showed how to generate these, and their inverses, as 256-entry tables.

    As Daemen and Rijmen point out, any function from a finite field to itself can be expressed as a polynomial. In fact, given a tabular form of the function, it is possible to generate the Lagrange polynomial and then simplify.

    They also give the polynomial for S(x):

    SRD[x] = 05·x255 + 09·x253 + F9·x251 + 25·x247 + F4·x239 + 01·x223 + B5·x191 + 8F·x127 + 63

    Well, let's check this.
    And while we're at it, let's find the polynomials for g(x), f(x) and even S-1(x) too.
    First of all, let's start with the Lagrange polynomial.

    Given a table of entries { (00, y00), (01, y01), ..., (ff, yff) }, there is a polynomial L(x) which gives the same output, namely:

    L(x) = Σi ∈ { 00, 01, ..., ff } yi pi(x)
    where pi(x) = Πj ∈ { 00, 01, ..., ff }, ji (x - j) / (i - j)

    Can we simplify this?
    Yes. Note that (i - j)-1 term varies over all the non-zero elements of the finite field. Since this is a field, every non-zero element has an inverse, which might or might not be itself.
    If the inverse is not itself, we can pair the two inverse elements together, and we get 01, which is the multiplicative identity, so we can ignore it.
    What are we left with? The product of those non-zero elements which are their own inverses.
    In the case of GF(28), there is only one such element, namely 01.
    Great! We can ignore the (i - j)-1 terms altogether:

    L(x) = Σi ∈ { 00, 01, ..., ff } yi Πj ∈ { 00, 01, ..., ff }, ji (x - j)

    What is this? A sum of 256 different polynomials.
    Each of these summands is a product of 255 polynomials of degree 1, and so is a polynomial of degree 255.
    But in GF(28), x255 = 01 for all x. So each of the summands is actually a polynomial of degree 254, or less.
    So the sum is also a polynomial of degree 254, or less; terms with an exponent >= 255 go away and just contribute to lower-order terms.

    Great. We have { (00, y00), (01, y01), ..., (ff, yff) } for f, g, S, and S-1. Let's plug them in and see what happens!

    I wrote a Perl script to do this; the script is attached. Here's the output. (It's quite slow; it takes about two and a half minutes to run.)

    g(x) = 01 x^254

    f(x) =
     63 + 05 x + 09 x^2 + f9 x^4 + 25 x^8 + f4 x^16 + 01 x^32 + b5 x^64 +
     8f x^128

    S(x) =
     63 + 8f x^127 + b5 x^191 + 01 x^223 + f4 x^239 + 25 x^247 + f9 x^251 + 09 x^253 +
     05 x^254

    S^(-1)(x) =
     52 + f3 x + 7e x^2 + 1e x^3 + 90 x^4 + bb x^5 + 2c x^6 + 8a x^7 +
     1c x^8 + 85 x^9 + 6d x^10 + c0 x^11 + b2 x^12 + 1b x^13 + 40 x^14 + 23 x^15 +
     f6 x^16 + 73 x^17 + 29 x^18 + d9 x^19 + 39 x^20 + 21 x^21 + cf x^22 + 3d x^23 +
     9a x^24 + 8a x^25 + 2f x^26 + cf x^27 + 7b x^28 + 04 x^29 + e8 x^30 + c8 x^31 +
     85 x^32 + 7b x^33 + 7c x^34 + af x^35 + 86 x^36 + 2f x^37 + 13 x^38 + 65 x^39 +
     75 x^40 + d3 x^41 + 6d x^42 + d4 x^43 + 89 x^44 + 8e x^45 + 65 x^46 + 05 x^47 +
     ea x^48 + 77 x^49 + 50 x^50 + a3 x^51 + c5 x^52 + 01 x^53 + 0b x^54 + 46 x^55 +
     bf x^56 + a7 x^57 + 0c x^58 + c7 x^59 + 8e x^60 + f2 x^61 + b1 x^62 + cb x^63 +
     e5 x^64 + e2 x^65 + 10 x^66 + d1 x^67 + 05 x^68 + b0 x^69 + f5 x^70 + 86 x^71 +
     e4 x^72 + 03 x^73 + 71 x^74 + a6 x^75 + 56 x^76 + 03 x^77 + 9e x^78 + 3e x^79 +
     19 x^80 + 18 x^81 + 52 x^82 + 16 x^83 + b9 x^84 + d3 x^85 + 38 x^86 + d9 x^87 +
     04 x^88 + e3 x^89 + 72 x^90 + 6b x^91 + ba x^92 + e8 x^93 + bf x^94 + 9d x^95 +
     1d x^96 + 5a x^97 + 55 x^98 + ff x^99 + 71 x^100 + e1 x^101 + a8 x^102 + 8e x^103 +
     fe x^104 + a2 x^105 + a7 x^106 + 1f x^107 + df x^108 + b0 x^109 + 03 x^110 + cb x^111 +
     08 x^112 + 53 x^113 + 6f x^114 + b0 x^115 + 7f x^116 + 87 x^117 + 8b x^118 + 02 x^119 +
     b1 x^120 + 92 x^121 + 81 x^122 + 27 x^123 + 40 x^124 + 2e x^125 + 1a x^126 + ee x^127 +
     10 x^128 + ca x^129 + 82 x^130 + 4f x^131 + 09 x^132 + aa x^133 + c7 x^134 + 55 x^135 +
     24 x^136 + 6c x^137 + e2 x^138 + 58 x^139 + bc x^140 + e0 x^141 + 26 x^142 + 37 x^143 +
     ed x^144 + 8d x^145 + 2a x^146 + d5 x^147 + ed x^148 + 45 x^149 + c3 x^150 + ec x^151 +
     1c x^152 + 3e x^153 + 2a x^154 + b3 x^155 + 9e x^156 + b7 x^157 + 38 x^158 + 82 x^159 +
     23 x^160 + 2d x^161 + 87 x^162 + ea x^163 + da x^164 + 45 x^165 + 24 x^166 + 03 x^167 +
     e7 x^168 + 19 x^169 + e3 x^170 + d3 x^171 + 4e x^172 + dd x^173 + 11 x^174 + 4e x^175 +
     81 x^176 + 91 x^177 + 91 x^178 + 59 x^179 + a3 x^180 + 80 x^181 + 92 x^182 + 7e x^183 +
     db x^184 + c4 x^185 + 20 x^186 + ec x^187 + db x^188 + 55 x^189 + 7f x^190 + a8 x^191 +
     c1 x^192 + 64 x^193 + ab x^194 + 1b x^195 + fd x^196 + 60 x^197 + 05 x^198 + 13 x^199 +
     2c x^200 + a9 x^201 + 76 x^202 + a5 x^203 + 1d x^204 + 32 x^205 + 8e x^206 + 1e x^207 +
     c0 x^208 + 65 x^209 + cb x^210 + 8b x^211 + 93 x^212 + e4 x^213 + ae x^214 + be x^215 +
     5f x^216 + 2c x^217 + 3b x^218 + d2 x^219 + 0f x^220 + 9f x^221 + 42 x^222 + cc x^223 +
     6c x^224 + 80 x^225 + 68 x^226 + 43 x^227 + 09 x^228 + 23 x^229 + c5 x^230 + 6d x^231 +
     1d x^232 + 18 x^233 + bd x^234 + 5e x^235 + 1b x^236 + b4 x^237 + 85 x^238 + 49 x^239 +
     bc x^240 + 0d x^241 + 1f x^242 + a6 x^243 + 6b x^244 + d8 x^245 + 22 x^246 + 01 x^247 +
     7a x^248 + c0 x^249 + 55 x^250 + 16 x^251 + b3 x^252 + cf x^253 + 05 x^254

    Daemen and Rijmen's polynomial rendition of S(x) is confirmed.

    Also note how simple g(x) is, and how complex S-1(x) is.

    Finally, I must confess that none of this is actually useful. The tabular form is much more convenient for applications. This is just a fun theoretical exercise.

    (Well, I had fun.)

  • Matthew van Eerde's web log

    Generating the Rijndael S-box

    • 0 Comments

    I talked about Rijndael in a couple of previous posts: Efficient multiplication and division in GF(28), Sieving irreducible monic polynomials over a finite field, Addition and multiplication table for GF(22).

    I'm going to talk some more about it today.

    One of the more interesting steps used in the Rijndael transformation is the non-linear S-box. This is a function S(a) that takes an element of GF(28) (which could be represented as a byte) and returns another one. It is invertible but non-linear.

    The spec defines S(a) in terms of two other invertible functions g(a) and f(a). In particular S(a) = f(g(a)). It follows that S-1(a) = g-1(f-1(a)).

    g(a) is the non-linear piece, and is quite simple:
    g(00) is defined as 00.
    g(a) = a-1 for all other a in GF(28). In particular, if a = 03b, then g(a) = 03255 - b.
    This is clearly invertible; in fact, it is its own inverse.

    Daemen and Rijmen seem to have been almost embarrassed by the simplicity of g so they introduced f. To quote from section 3.4.1:

    By definition, g has a very simple algebraic expression. This could allow algebraic manipulations that can be used to mount attacks such as interpolation attacks. Therefore, we built the S-box as the sequence of g and an invertible affine transformation f.

    I don't know if I buy the "simplicity" argument. It seems to me that if Rijndael, without f, is robust, then it's robust. And if you add f to a non-robust scheme, I don't understand how that makes it robust; I do see that it complicates analysis, but that seems like a drawback rather than an advantage.

    But I'll play along for now. What is f?

    b = f(a) is defined using the following matrix multiplication over GF(2) (each entry can be represented as a bit; a row or column as a byte.) Multiplication can be implemented as bitwise AND, and addition by bitwise XOR.

    In particular, note that f(00) = 63. So S(00) = f(g(00)) = f(00) = 63. So S-1(63) = 00.

    The reference implementation in the book uses hardcoded 256-byte lookup tables for S(a) and S-1(a). I wrote a Perl script which generates these lookup tables and prints them out. This script is attached.

    Here's its output, which matches the listing in the book:

    >perl -w s-box.pl
                              S(xy)
       | x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
    ---+------------------------------------------------
    0y | 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
    1y | ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
    2y | b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
    3y | 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
    4y | 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
    5y | 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
    6y | d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
    7y | 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
    8y | cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
    9y | 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
    ay | e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
    by | e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
    cy | ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
    dy | 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
    ey | e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
    fy | 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16

                            S^(-1)(xy)
       | x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf
    ---+------------------------------------------------
    0y | 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb
    1y | 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb
    2y | 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e
    3y | 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25
    4y | 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92
    5y | 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84
    6y | 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06
    7y | d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b
    8y | 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73
    9y | 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e
    ay | 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b
    by | fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4
    cy | 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f
    dy | 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
    ey | a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61
    fy | 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d

    And here's the interesting part, the implementation of g(a) and f(a):

    sub g($) {
        my ($a) = shift;

        # g(0) we define to be 0
        return 0 unless $a;

        # otherwise g(a) = a^(-1)
        # a = (x + 1)^loga
        # so a^(-1) = (x + 1)^(-loga) = (x + 1)^(255 - loga)
        my $loga = $log_xplusone_of[$a];
        my $logb = 255 - $loga;
        $logb -= 255 if $logb >= 255;

        return $xplusone_to[$logb];
    }

    # f(a) = b is defined as follows:
    #
    # [ b7 ]   ( [ 1 1 1 1 1 0 0 0 ] [ a7 ] )   [ 0 ]
    # [ b6 ]   ( [ 0 1 1 1 1 1 0 0 ] [ a6 ] )   [ 1 ]
    # [ b5 ]   ( [ 0 0 1 1 1 1 1 0 ] [ a5 ] )   [ 1 ]
    # [ b4 ] = ( [ 0 0 0 1 1 1 1 1 ] [ a4 ] ) + [ 0 ]
    # [ b3 ]   ( [ 1 0 0 0 1 1 1 1 ] [ a3 ] )   [ 0 ]
    # [ b2 ]   ( [ 1 1 0 0 0 1 1 1 ] [ a2 ] )   [ 0 ]
    # [ b1 ]   ( [ 1 1 1 0 0 0 1 1 ] [ a1 ] )   [ 1 ]
    # [ b0 ]   ( [ 1 1 1 1 0 0 0 1 ] [ a0 ] )   [ 1 ]
    #
    # where the + is XOR
    sub f($) {
        my ($a) = @_;

        # start with the addition
        my $b = 0x63; # 0b01100011;

        # do the matrix multiplication
        # one matrix column at a time
        for (
            my ($c, $i) = (0x8f, 0x80); # 0b10001111, 0b10000000
            $i;
            $c = rotate_byte_right($c), $i >>= 1
        ) {
            # i is used to select a bit out of the a column
            # c is the matrix column which is multiplied by that bit
            # the resulting product influences the eventual b column

            # printf("%02x %02x\n", $c, $i);

            # if this bit in the a column is 0, all of the products are 0, so don't bother
            next unless $a & $i;

            # this bit in the a column is 1
            # so XOR b with the matrix column
            $b ^= $c;
        }

        return $b;
    }
  • Matthew van Eerde's web log

    Troubleshooting default audio device heuristics

    • 1 Comments

    In Windows 7 we published a white paper which shows how Windows chooses which audio device should be the default. This remains true for Windows 8 and Windows 8.1.

    The six factors that are considered for each device are:

    1. Jack detection capability
      1. Whether KSJACK_DESCRIPTION2.JackCapabilities has the JACKDESC2_PRESENCE_DETECT_CAPABILITY flag set
      2. USB audio devices have jack detection capability grandfathered in
    2. Form factor: PKEY_AudioEndpoint_FormFactor
    3. Kernel Streaming node type: PKEY_AudioEndpoint_JackSubType (this comes from KSPIN_DESCRIPTOR.Category field set by the driver)
    4. Bus type: HD Audio, Bluetooth, USB, etc.
    5. General location: IKsJackDescription's KSJACK_DESCRIPTION.GenLocation
    6. Geometric location: IKsJackDescrption's KSJACK_DESCRIPTION.GeoLocation

    Each factor has an associated preference list and a weight. Generally speaking, items that appear higher in the list above have priority over items that appear lower.

    I wrote a tool that uses public APIs to query each of these six factors for all devices on the system. Source and binaries are attached.

    Here's what it outputs on my system that has a built-in HD Audio device and an external USB device plugged in:

    >defaultaudiodevice.exe
    Render devices: 4
        Speakers (High Definition Audio Device)
        Id: {0.0.0.00000000}.{2089b0a1-893b-45fa-b6e8-ec84ee827c06}
        KSJACK_DESCRIPTION2.JackCapabilities: presence detection not supported
        PKEY_AudioEndpoint_FormFactor: 1 (Speakers)
        PKEY_AudioEndpoint_JackSubType: {DFF21CE1-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_SPEAKER)
        PKEY_Device_EnumeratorName: HDAUDIO
        KSJACK_DESCRIPTION.GenLocation: 0 (eGenLocPrimaryBox)
        KSJACK_DESCRIPTION.GeoLocation: 13 (eGeoLocATAPI)

        Speakers (USB Audio)
        Id: {0.0.0.00000000}.{4bf6c88a-9b54-46cc-9eb0-91c5fa468ed6}
        IKsJackDescription2 not supported
        PKEY_AudioEndpoint_FormFactor: 1 (Speakers)
        PKEY_AudioEndpoint_JackSubType: {DFF21CE1-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_SPEAKER)
        PKEY_Device_EnumeratorName: USB
        IKsJackDescription not supported

        Speakers (High Definition Audio Device)
        Id: {0.0.0.00000000}.{89482885-856d-4b7b-88ed-3ffa4944f14b}
        KSJACK_DESCRIPTION2.JackCapabilities: presence detection supported
        PKEY_AudioEndpoint_FormFactor: 1 (Speakers)
        PKEY_AudioEndpoint_JackSubType: {DFF21CE1-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_SPEAKER)
        PKEY_Device_EnumeratorName: HDAUDIO
        KSJACK_DESCRIPTION.GenLocation: 0 (eGenLocPrimaryBox)
        KSJACK_DESCRIPTION.GeoLocation: 1 (eGeoLocRear)

        SPDIF Interface (USB Audio)
        Id: {0.0.0.00000000}.{c2c47ae1-abe1-43fa-825a-871bab831cf4}
        IKsJackDescription2 not supported
        PKEY_AudioEndpoint_FormFactor: 8 (SPDIF)
        PKEY_AudioEndpoint_JackSubType: {DFF21FE5-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_SPDIF_INTERFACE)
        PKEY_Device_EnumeratorName: USB
        IKsJackDescription not supported

    Capture devices: 3
        Microphone (USB Audio)
        Id: {0.0.1.00000000}.{139b7b57-28d2-46de-9fce-702c0944c03f}
        IKsJackDescription2 not supported
        PKEY_AudioEndpoint_FormFactor: 4 (Microphone)
        PKEY_AudioEndpoint_JackSubType: {DFF21BE1-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_MICROPHONE)
        PKEY_Device_EnumeratorName: USB
        IKsJackDescription not supported

        Line (USB Audio)
        Id: {0.0.1.00000000}.{67982570-bdea-4524-82ac-3d15d2b5ae89}
        IKsJackDescription2 not supported
        PKEY_AudioEndpoint_FormFactor: 2 (LineLevel)
        PKEY_AudioEndpoint_JackSubType: {DFF21FE3-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_LINE_CONNECTOR)
        PKEY_Device_EnumeratorName: USB
        IKsJackDescription not supported

        SPDIF Interface (USB Audio)
        Id: {0.0.1.00000000}.{72b7efb6-c21a-4415-84e5-292d8a5b2b11}
        IKsJackDescription2 not supported
        PKEY_AudioEndpoint_FormFactor: 8 (SPDIF)
        PKEY_AudioEndpoint_JackSubType: {DFF21FE5-F70F-11D0-B917-00A0C9223196} (KSNODETYPE_SPDIF_INTERFACE)
        PKEY_Device_EnumeratorName: USB
        IKsJackDescription not supported

    Jack Detection Capability

    The first render device does not have jack presence detection, and the other three do. (Note the USB audio devices have grandfathered-in jack presence detection, even though the USB audio class driver does not support IKsJackDescription2.) This puts the first render device out of the running.

    Form Factor

    The fourth render device has form factor SPDIF, and the second and third have form factor Speakers. This puts the fourth render device out of the running, since the heuristics consider Speakers more attractive than SPDIF.

    Kernel Streaming node type

    The two remaining render devices both have form factor KSNODETYPE_SPEAKER, so they are tied.

    Bus type

    The second render device has bus type USB, and the third render device has bus type HD Audio. The default device heuristics prefer USB to HD Audio, so the second render device becomes the default audio render device.

  • Matthew van Eerde's web log

    Efficient multiplication and division in GF(2⁸)

    • 0 Comments

    I talked about Rijndael in a couple of previous posts: Sieving irreducible monic polynomials over a finite field, Addition and multiplication table for GF(2²)

    I'm going to talk some more about it today.

    Rijndael does a lot of arithmetic operations (addition and multiplication) on elements of GF(28)4.

    These are represented as polynomials b0 + b1 x + b2 x2 + b3 x3.

    The individual coefficients bi are elements of GF(28).

    These are themselves polynomials a0 + a1 x + ... + a7 x7 where the individual coefficients ai are bits (0 or 1).

    Multiplication bi bj is made into a closed operation by taking the modulus of the product under the reduction polynomial m = x8 + x4 + x3 + x + 1.

    The reduction polynomial is prime, which is important in establishing that this representation of the Galois field really is a field.

    For convenience we will represent the elements of GF(28) as hexadecimal numbers 0x00, 0x01, ... 0xfe, 0xff.

    By contrast, we will represent the elements of GF(28)4 as vectors (b0 b1 b2 b3); for example, (0x00 0x01 0x02 0x03).

    Multiplying vectors in GF(28)4 induces a bunch of multiplications in GF(28). It would be good to come up with a way to make this fast. Doing a polynomial reduction every time is not fast.

    Of course, one way to do this is to create a multiplication table with 256 rows and 256 columns, do each multiplication the slow way once, and compile in the results. This would require on the order of 64 kB. But there's a better way.

    The trick that is used in Daemen and Rijmen's implementation is to find a primitive element e in GF(28) so that the orbit of e (that is to say, the set {e0, e1, ..., e254}) spans all of GF(28) - {0x00}.

    If we find such an e, then we can save a cheat sheet which is almost as fast as the multiplication table, but requires storing only on the order of 256 bytes.

    Well, let's calculate the orbits of all the elements in order until we find a primitive element.

    Orbit(0x00) = { 0x000, 0x001, 0x002, ... } = { 0x01, 0x00, 0x00, ... } = { 0x00, 0x01 }. Nope.
    Orbit(0x01) = { 0x010, 0x011, 0x012, ... } = { 0x01, 0x01, 0x01, ... } = { 0x01 }. Even worse.
    Orbit(0x02) = { 0x020, 0x021, 0x022, ... } = { 0x01, 0x02, 0x04, ..., 0x8d }. This looks promising at first but we end up with an orbit of only 51 out of the necessary 255 elements (note that 51 divides 255:)

    0x020 = 0x01
    0x021 = 0x02
    0x022 = 0x04
    0x023 = 0x08
    0x024 = 0x10
    0x025 = 0x20
    0x026 = 0x40
    0x027 = 0x80
    0x028 = 0x1b
    0x029 = 0x36
    0x0210 = 0x6c
    0x0211 = 0xd8
    0x0212 = 0xab
    0x0213 = 0x4d
    0x0214 = 0x9a
    0x0215 = 0x2f
    0x0216 = 0x5e
    0x0217 = 0xbc
    0x0218 = 0x63
    0x0219 = 0xc6
    0x0220 = 0x97
    0x0221 = 0x35
    0x0222 = 0x6a
    0x0223 = 0xd4
    0x0224 = 0xb3
    0x0225 = 0x7d
    0x0226 = 0xfa
    0x0227 = 0xef
    0x0228 = 0xc5
    0x0229 = 0x91
    0x0230 = 0x39
    0x0231 = 0x72
    0x0232 = 0xe4
    0x0233 = 0xd3
    0x0234 = 0xbd
    0x0235 = 0x61
    0x0236 = 0xc2
    0x0237 = 0x9f
    0x0238 = 0x25
    0x0239 = 0x4a
    0x0240 = 0x94
    0x0241 = 0x33
    0x0242 = 0x66
    0x0243 = 0xcc
    0x0244 = 0x83
    0x0245 = 0x1d
    0x0246 = 0x3a
    0x0247 = 0x74
    0x0248 = 0xe8
    0x0249 = 0xcb
    0x0250 = 0x8d
    And then we circle back around to...
    0x0251 = 0x01

    Well, let's keep looking. What about 0x03?

    Orbit(0x03) = { 0x030, 0x031, 0x032, ... } = { 0x01, 0x03, 0x05, ..., 0xf6 }. Bingo, we hit all 255 non-0x00 elements!

    0x030 = 0x01
    0x031 = 0x03
    0x032 = 0x05
    0x033 = 0x0f
    0x034 = 0x11
    0x035 = 0x33
    0x036 = 0x55
    0x037 = 0xff
    0x038 = 0x1a
    0x039 = 0x2e
    0x0310 = 0x72
    0x0311 = 0x96
    0x0312 = 0xa1
    0x0313 = 0xf8
    0x0314 = 0x13
    0x0315 = 0x35
    0x0316 = 0x5f
    0x0317 = 0xe1
    0x0318 = 0x38
    0x0319 = 0x48
    0x0320 = 0xd8
    0x0321 = 0x73
    0x0322 = 0x95
    0x0323 = 0xa4
    0x0324 = 0xf7
    0x0325 = 0x02
    0x0326 = 0x06
    0x0327 = 0x0a
    0x0328 = 0x1e
    0x0329 = 0x22
    0x0330 = 0x66
    0x0331 = 0xaa
    0x0332 = 0xe5
    0x0333 = 0x34
    0x0334 = 0x5c
    0x0335 = 0xe4
    0x0336 = 0x37
    0x0337 = 0x59
    0x0338 = 0xeb
    0x0339 = 0x26
    0x0340 = 0x6a
    0x0341 = 0xbe
    0x0342 = 0xd9
    0x0343 = 0x70
    0x0344 = 0x90
    0x0345 = 0xab
    0x0346 = 0xe6
    0x0347 = 0x31
    0x0348 = 0x53
    0x0349 = 0xf5
    0x0350 = 0x04
    0x0351 = 0x0c
    0x0352 = 0x14
    0x0353 = 0x3c
    0x0354 = 0x44
    0x0355 = 0xcc
    0x0356 = 0x4f
    0x0357 = 0xd1
    0x0358 = 0x68
    0x0359 = 0xb8
    0x0360 = 0xd3
    0x0361 = 0x6e
    0x0362 = 0xb2
    0x0363 = 0xcd
    0x0364 = 0x4c
    0x0365 = 0xd4
    0x0366 = 0x67
    0x0367 = 0xa9
    0x0368 = 0xe0
    0x0369 = 0x3b
    0x0370 = 0x4d
    0x0371 = 0xd7
    0x0372 = 0x62
    0x0373 = 0xa6
    0x0374 = 0xf1
    0x0375 = 0x08
    0x0376 = 0x18
    0x0377 = 0x28
    0x0378 = 0x78
    0x0379 = 0x88
    0x0380 = 0x83
    0x0381 = 0x9e
    0x0382 = 0xb9
    0x0383 = 0xd0
    0x0384 = 0x6b
    0x0385 = 0xbd
    0x0386 = 0xdc
    0x0387 = 0x7f
    0x0388 = 0x81
    0x0389 = 0x98
    0x0390 = 0xb3
    0x0391 = 0xce
    0x0392 = 0x49
    0x0393 = 0xdb
    0x0394 = 0x76
    0x0395 = 0x9a
    0x0396 = 0xb5
    0x0397 = 0xc4
    0x0398 = 0x57
    0x0399 = 0xf9
    0x03100 = 0x10
    0x03101 = 0x30
    0x03102 = 0x50
    0x03103 = 0xf0
    0x03104 = 0x0b
    0x03105 = 0x1d
    0x03106 = 0x27
    0x03107 = 0x69
    0x03108 = 0xbb
    0x03109 = 0xd6
    0x03110 = 0x61
    0x03111 = 0xa3
    0x03112 = 0xfe
    0x03113 = 0x19
    0x03114 = 0x2b
    0x03115 = 0x7d
    0x03116 = 0x87
    0x03117 = 0x92
    0x03118 = 0xad
    0x03119 = 0xec
    0x03120 = 0x2f
    0x03121 = 0x71
    0x03122 = 0x93
    0x03123 = 0xae
    0x03124 = 0xe9
    0x03125 = 0x20
    0x03126 = 0x60
    0x03127 = 0xa0
    0x03128 = 0xfb
    0x03129 = 0x16
    0x03130 = 0x3a
    0x03131 = 0x4e
    0x03132 = 0xd2
    0x03133 = 0x6d
    0x03134 = 0xb7
    0x03135 = 0xc2
    0x03136 = 0x5d
    0x03137 = 0xe7
    0x03138 = 0x32
    0x03139 = 0x56
    0x03140 = 0xfa
    0x03141 = 0x15
    0x03142 = 0x3f
    0x03143 = 0x41
    0x03144 = 0xc3
    0x03145 = 0x5e
    0x03146 = 0xe2
    0x03147 = 0x3d
    0x03148 = 0x47
    0x03149 = 0xc9
    0x03150 = 0x40
    0x03151 = 0xc0
    0x03152 = 0x5b
    0x03153 = 0xed
    0x03154 = 0x2c
    0x03155 = 0x74
    0x03156 = 0x9c
    0x03157 = 0xbf
    0x03158 = 0xda
    0x03159 = 0x75
    0x03160 = 0x9f
    0x03161 = 0xba
    0x03162 = 0xd5
    0x03163 = 0x64
    0x03164 = 0xac
    0x03165 = 0xef
    0x03166 = 0x2a
    0x03167 = 0x7e
    0x03168 = 0x82
    0x03169 = 0x9d
    0x03170 = 0xbc
    0x03171 = 0xdf
    0x03172 = 0x7a
    0x03173 = 0x8e
    0x03174 = 0x89
    0x03175 = 0x80
    0x03176 = 0x9b
    0x03177 = 0xb6
    0x03178 = 0xc1
    0x03179 = 0x58
    0x03180 = 0xe8
    0x03181 = 0x23
    0x03182 = 0x65
    0x03183 = 0xaf
    0x03184 = 0xea
    0x03185 = 0x25
    0x03186 = 0x6f
    0x03187 = 0xb1
    0x03188 = 0xc8
    0x03189 = 0x43
    0x03190 = 0xc5
    0x03191 = 0x54
    0x03192 = 0xfc
    0x03193 = 0x1f
    0x03194 = 0x21
    0x03195 = 0x63
    0x03196 = 0xa5
    0x03197 = 0xf4
    0x03198 = 0x07
    0x03199 = 0x09
    0x03200 = 0x1b
    0x03201 = 0x2d
    0x03202 = 0x77
    0x03203 = 0x99
    0x03204 = 0xb0
    0x03205 = 0xcb
    0x03206 = 0x46
    0x03207 = 0xca
    0x03208 = 0x45
    0x03209 = 0xcf
    0x03210 = 0x4a
    0x03211 = 0xde
    0x03212 = 0x79
    0x03213 = 0x8b
    0x03214 = 0x86
    0x03215 = 0x91
    0x03216 = 0xa8
    0x03217 = 0xe3
    0x03218 = 0x3e
    0x03219 = 0x42
    0x03220 = 0xc6
    0x03221 = 0x51
    0x03222 = 0xf3
    0x03223 = 0x0e
    0x03224 = 0x12
    0x03225 = 0x36
    0x03226 = 0x5a
    0x03227 = 0xee
    0x03228 = 0x29
    0x03229 = 0x7b
    0x03230 = 0x8d
    0x03231 = 0x8c
    0x03232 = 0x8f
    0x03233 = 0x8a
    0x03234 = 0x85
    0x03235 = 0x94
    0x03236 = 0xa7
    0x03237 = 0xf2
    0x03238 = 0x0d
    0x03239 = 0x17
    0x03240 = 0x39
    0x03241 = 0x4b
    0x03242 = 0xdd
    0x03243 = 0x7c
    0x03244 = 0x84
    0x03245 = 0x97
    0x03246 = 0xa2
    0x03247 = 0xfd
    0x03248 = 0x1c
    0x03249 = 0x24
    0x03250 = 0x6c
    0x03251 = 0xb4
    0x03252 = 0xc7
    0x03253 = 0x52
    0x03254 = 0xf6
    ... and only now do we circle around to...
    0x03255 = 0x01

    This gives us a one-to-one mapping between the 255 powers 0 through 254 and the 255 non-zero bytes 0x01 through 0xff.

    Here are the perl scripts I used to generate these:

    use strict;

    my $m = 0x11b;

    my $a = 1;
    for (my $i = 0; ; $i++) {
        printf("0x02^%2d = 0x%02x\n", $i, $a);
        if ($i > 0 and $a == 1) { last; }
        $a <<= 1; # * 0x02
        if ($a > 0xff) { $a ^= $m; }
    }

    print "\n";

    $a = 1;
    for (my $i = 0; ; $i++) {
        printf("0x03^%3d = 0x%02x\n", $i, $a);
        if ($i > 0 and $a == 1) { last; }
        $a = ($a << 1) ^ $a; # * 0x03
        if ($a > 0xff) { $a ^= $m; }
    }

    Now that we have this orbit, we use it to construct two lookup tables:

    // 255 entries xPlusOne_ToThe[0] = 0x01 to xPlusOne_ToThe[254] = 0xf6
    xPlusOne_ToThe[] = { 0x01, 0x03, 0x05, ..., 0x52, 0xf6 };

    And the inverse mapping:

    // 255 entries logXPlusOne_Of[0x01] = 0 to logXPlusOne_Of[0xff] = 7
    logXPlusOne_Of[] = { UNDEFINED, 0, 25, 1, 50, ..., 112, 7 };

    Now that we have paid for these 512-ish bytes, we can do multiplication very cheaply:

    multiply(b1, b2) {
        if (0x00 == b1 || 0x00 == b2) { return 0x00; }

        logAns = logXPlusOne_Of[b1] + logXPlusOne_Of[b2];
        if (logAns >= 255) { logAns -= 255; }
        return xPlusOne_ToThe[logAns];
    }

  • Matthew van Eerde's web log

    A mental model for the Windows Phone AudioRoutingManager API

    • 0 Comments

    The Windows Phone SDK includes a Windows.Phone.Media.Devices.AudioRoutingManager API which I had occasion to use.

    The API allows apps that have communication audio streams (e.g., Voice over IP calls) to control whether the audio goes out over the earpiece, over the speakerphone, or over the Bluetooth headset. This might be done automatically, or might be used to power "Speakerphone" and "Bluetooth" buttons in the app UI.

    The starting point is a GetDefault() method which gives you the singleton AudioRoutingManager object.

    There are three ways to get information out of this object:

    1. A read-only AvailableAudioEndpoints property tells you the list of currently available audio outputs.
    2. A GetAudioEndpoint method tells you what the current audio output is.
    3. An AudioEndpointChanged callback tells you when either of the previous two things change.

    You can also tell the object to change something:

    1. SetAudioEndpoint(…) lets you tell Windows Phone where audio should come out, subject to some restrictions.

    There are two enumerated types used by these methods:

    1. AvailableAudioRoutingEndpoints, which is the type of the AvailableAudioEndpoints property. This is a "flags"-style (multi-valued) enum with the following values:
      • None
      • Earpiece
      • Speakerphone
      • Bluetooth
    2. AudioRoutingEndpoint, which is returned by GetAudioEndpoint and is the sole argument for SetAudioEndpoint. This is a single-valued enum with the following values:
      • Default
      • Earpiece
      • Speakerphone
      • Bluetooth
      • WiredHeadset
      • WiredHeadsetSpeakerOnly
      • BluetoothWithNoiseAndEchoCancellation

    At first I found this very confusing. SetAudioEndpoint takes an AudioRoutingEndpoint type, but what do I pass to it? And why does GetAudioEndpoint always tell me "Speakerphone?"

    After experimenting and chatting with the folks who own the API I was able to construct an internal mental model which made more sense to me.

    1. While communications audio is playing, the Phone has an audio routing policy. Imagine an AudioRoutingPolicy write-only property with the following values:
      • I'm flexible: play to the first available of { wired headset, Bluetooth device, earpiece }
      • Gimme Bluetooth: play to the first available of { Bluetooth device, wired headset, earpiece }
      • No Bluetooth: play to the first available of { wired headset, earpiece }
      • Speakerphone: play to the built-in speaker
    2. If you want to change this policy, the app needs to have either the ID_CAP_VOIP or ID_CAP_VOICEMAIL capability. (The documentation refers to an ID_CAP_AUDIOROUTING capability, but this does not exist.)
      Do:
          var audioRoutingManager = Windows.Phone.Media.Devices.AudioRoutingManager.GetDefault();
          audioRoutingManager.SetAudioEndpoint(x);

      where xis as follows:
      • x = AudioRoutingEndpoint.Bluetooth sets the policy to Gimme Bluetooth
      • x = AudioRoutingEndpoint.Earpiece sets the policy to No Bluetooth
      • x = AudioRoutingEndpoint.Speakerphone sets the policy to Speakerphone
    3. There is no direct way to set the policy to I'm flexible.
    4. There is no direct way to tell what the current value of the AudioRoutingPolicy is. You can sometimes guess, though, based on the value of GetAudioEndpoint and/or AvailableAudioEndpoints.
      • If GetAudioEndpoint is AudioRoutingEndpoint.Speakerphone, then the current policy must be Speakerphone.
      • If GetAudioEndpoint is AudioRoutingEndpoint.Earpiece and AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth is set, then the current policy must be No Bluetooth.
      • If GetAudioEndpoint is AudioRoutingEndpoint.WiredHeadset and AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth is set, then the current policy must be either I'm flexible or No Bluetooth.
      • If GetAudioEndpoint is AudioRoutingEndpoint.Bluetooth or AudioRoutingEndpoint.BluetoothWithNoiseAndEchoCancelation, then AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth must be set, and the current policy must be either I'm flexible or Gimme Bluetooth.
    5. When there are no audio communications streams, the policy is undefined.
    6. When the number of audio communications streams goes from zero to one (perhaps as the result of a phone call or VoIP call), the policy is reset/defaulted to I'm flexible or No Bluetooth depending on the details. This means you shouldn't bother setting a policy until after your phone call starts playing audio.
    7. When a device is connected (a Bluetooth device is connected, or a wired headset is plugged in) the policy is reset to I'm flexible. This (usually) results in audio switching to the new device, which is usually what the user wants.
      (If a device is removed, on the other hand, the policy is not reset.)

    Here's a chart I put together on how the different states and policies interact:

    If a Bluetooth Hands-Free HF device is: Connected
    AvailableAudioEndpoints =
      Speakerphone | Earpiece | Bluetooth
    Not connected
    AvailableAudioEndpoints =
      Speakerphone | Earpiece
    I'm flexible audio routing policy is:
    This policy may be automatically invoked when:
      a call starts, or
      a device connects
    You can manually invoke it with:
      SetAudioEndpoint(Bluetooth)
    WiredHeadset or
    WiredHeadsetSpeakerOnly or
    Bluetooth or
    BluetoothWith...
    Depending on what is plugged in, and the capabilities of the device
    WiredHeadset or
    WiredHeadsetSpeakerOnly or
    Earpiece
    Depending on what is plugged in
    Gimme Bluetooth audio routing policy is:
    This policy may be automatically invoked when a call starts
    You can manually invoke it with:
      SetAudioEndpoint(Bluetooth)
    Bluetooth or
    BluetoothWith...
    Depending on the capabilities of the device
    WiredHeadset or
    WiredHeadsetSpeakerOnly or
    Earpiece
    Depending on what is plugged in
    No Bluetooth audio routing policy is:
    You can manually invoke this policy with:
      SetAudioEndpoint(Earpiece) or
      SetAudioEndpoint(Default)
    WiredHeadset or
    WiredHeadsetSpeakerOnly or
    Earpiece
    Depending on what is plugged in
    Speakerphone audio routing policy is:
    You can manually invoke this policy with:
      SetAudioEndpoint(Speakerphone)
    Speakerphone
    Invalid audio routing policies:
    The following calls are all errors:
      SetAudioEndpoint(WiredHeadset)
      SetAudioEndpoint(WiredHeadsetSpeakerOnly)
      SetAudioEndpoint(BluetoothWith...)
    N/A
    SetAudioEndpoint throws an exception

    Note that if a wired headset is plugged in, the app has no way to make audio come out of the earpiece. This is true regardless of whether Bluetooth is connected.

    It seems like much of my confusion resulted from a single enumerated type (AudioRoutingEndpoint) serving three purposes:

    1. Tell the app where audio is coming out (WiredHeadset vs. Earpiece)
    2. Tell the app what the capabilities of the current output are (Bluetooth vs. BluetoothWithNoiseAndEchoCancellation)
    3. Allow the app to control the audio routing policy (Default vs. Speakerphone)

    I think it would have been clearer to make the audio routing policy a different enumerated type from the current audio output or the available audio outputs. But with the "audio routing policy" mental model, it's not too bad.

  • Matthew van Eerde's web log

    Sieving irreducible monic polynomials over a finite field

    • 1 Comments

    Last time we talked about the addition and multiplication tables for GF(2²); GF(28) is relevant for the Rijndael encryption scheme.

    Joan Daemen and Vincent Rijmen use a representation of GF(28) where each element is a polynomial of the form b7x7 + b6x6 + b5x5 + b4x4 + b3x3 + b2x2 + b1x + b0, with the bi being bits. A polynomial is represented by a single byte. Addition is component-wise (modulo 2); two polynomials can be added with a single XOR operation. Multiplication is a little more complicated.

    Rijndael defines multiplication as normal polynomial multiplication, followed by taking a modulus using the reduction polynomial mx8 + x4 + x3 + x + 1. Last time we showed that, in GF(2²), x2 + x + 1 was used to reduce polynomial multiplication, and also that a necessary quality for a reduction polynomial is that it be irreducible/prime.

    Last time I hinted at the question: how do we show that m is irreducible? Well, one way is to do trial division of all polynomials up to degree 4.

    But that's no fun. Instead, let's write a script to generate all irreducible polynomials, and see if m is on it! The approach is very similar to the Sieve of Eratosthenes: generate a list of all the polynomials, then traverse it from the low end to the high end; circle each element that hasn't been crossed out, then cross out all multiples of that element.

    The first argument q is the modulus of the coefficients (in this case, 2) and the second argument d (in this case, 8) is how high the degree can go. Here is the output of the script:

    >perl polynomial-sieve.pl 2 8
    Finding monic irreducible polynomials of degree up to 8 and coefficients mod 2
    Generating all monic polynomials...
    Sieving out all reducible polynomials...
    1
    x
    x + 1
    x^2 + x + 1
    x^3 + x + 1
    x^3 + x^2 + 1
    x^4 + x + 1
    x^4 + x^3 + 1
    x^4 + x^3 + x^2 + x + 1
    x^5 + x^2 + 1
    x^5 + x^3 + 1
    x^5 + x^3 + x^2 + x + 1
    x^5 + x^4 + x^2 + x + 1
    x^5 + x^4 + x^3 + x + 1
    x^5 + x^4 + x^3 + x^2 + 1
    x^6 + x + 1
    x^6 + x^3 + 1
    x^6 + x^4 + x^2 + x + 1
    x^6 + x^4 + x^3 + x + 1
    x^6 + x^5 + 1
    x^6 + x^5 + x^2 + x + 1
    x^6 + x^5 + x^3 + x^2 + 1
    x^6 + x^5 + x^4 + x + 1
    x^6 + x^5 + x^4 + x^2 + 1
    x^7 + x + 1
    x^7 + x^3 + 1
    x^7 + x^3 + x^2 + x + 1
    x^7 + x^4 + 1
    x^7 + x^4 + x^3 + x^2 + 1
    x^7 + x^5 + x^2 + x + 1
    x^7 + x^5 + x^3 + x + 1
    x^7 + x^5 + x^4 + x^3 + 1
    x^7 + x^5 + x^4 + x^3 + x^2 + x + 1
    x^7 + x^6 + 1
    x^7 + x^6 + x^3 + x + 1
    x^7 + x^6 + x^4 + x + 1
    x^7 + x^6 + x^4 + x^2 + 1
    x^7 + x^6 + x^5 + x^2 + 1
    x^7 + x^6 + x^5 + x^3 + x^2 + x + 1
    x^7 + x^6 + x^5 + x^4 + 1
    x^7 + x^6 + x^5 + x^4 + x^2 + x + 1
    x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + 1
    x^8 + x^4 + x^3 + x + 1
    x^8 + x^4 + x^3 + x^2 + 1
    x^8 + x^5 + x^3 + x + 1
    x^8 + x^5 + x^3 + x^2 + 1
    x^8 + x^5 + x^4 + x^3 + 1
    x^8 + x^5 + x^4 + x^3 + x^2 + x + 1
    x^8 + x^6 + x^3 + x^2 + 1
    x^8 + x^6 + x^4 + x^3 + x^2 + x + 1
    x^8 + x^6 + x^5 + x + 1
    x^8 + x^6 + x^5 + x^2 + 1
    x^8 + x^6 + x^5 + x^3 + 1
    x^8 + x^6 + x^5 + x^4 + 1
    x^8 + x^6 + x^5 + x^4 + x^2 + x + 1
    x^8 + x^6 + x^5 + x^4 + x^3 + x + 1
    x^8 + x^7 + x^2 + x + 1
    x^8 + x^7 + x^3 + x + 1
    x^8 + x^7 + x^3 + x^2 + 1
    x^8 + x^7 + x^4 + x^3 + x^2 + x + 1
    x^8 + x^7 + x^5 + x + 1
    x^8 + x^7 + x^5 + x^3 + 1
    x^8 + x^7 + x^5 + x^4 + 1
    x^8 + x^7 + x^5 + x^4 + x^3 + x^2 + 1
    x^8 + x^7 + x^6 + x + 1
    x^8 + x^7 + x^6 + x^3 + x^2 + x + 1
    x^8 + x^7 + x^6 + x^4 + x^2 + x + 1
    x^8 + x^7 + x^6 + x^4 + x^3 + x^2 + 1
    x^8 + x^7 + x^6 + x^5 + x^2 + x + 1
    x^8 + x^7 + x^6 + x^5 + x^4 + x + 1
    x^8 + x^7 + x^6 + x^5 + x^4 + x^2 + 1
    x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + 1
    Done!

    Note that m is not just prime, it is the lowest of the monic 8-degree polynomials with coefficients mod 2.

    The script itself (in Perl) is attached. It makes no claims to being pretty or efficient.

    As a check, there is an elegant formula for the number of irreducible monic polynomials with coefficients in a finite field:

    N(q, n) = (Σd|n μ(d) qn/d)/n

    where μ(x) is the Möbius function.

    In particular:

    N(2, 1) = ((1)21/1)/1 = 2
    N(2, 2) = ((1)22/1 - (1)22/2)/2 = 1
    N(2, 3) = ((1)23/1 - (1)23/3)/3 = 2
    N(2, 4) = ((1)24/1 - (1)24/2 + (0)24/4)/4 = 3
    N(2, 5) = ((1)25/1 - (1)25/5)/5 = 6
    N(2, 6) = ((1)26/1 - (1)26/2 - (1)26/3 + (1)26/6)/6 = 9
    N(2, 7) = ((1)27/1 - (1)27/7)/7 = 18
    N(2, 8) = ((1)28/1 - (1)28/2 + (0)28/4 + (0)28/8)/8 = 30

    This checks out with the script.

  • Matthew van Eerde's web log

    Addition and multiplication table for GF(2²)

    • 5 Comments

    I'm reading Joan Daemen and Vincent Rijmen's book The Design of Rijndael and I'm giving myself a refresher course on group theory.

    Key to the encryption standard is the Galois field on 256 elements GF(28). A multiplication table of 256 elements by 256 elements quickly becomes a wall of text, so let's reason by analogy and look at GF(22).

    There are a number of ways to represent elements of the field; we'll start by representing them as polynomials with degree at most 1, and with integer coefficients modulo 2. There are four such polynomials: {0, 1, x, x + 1}.

    Here are the addition and multiplication tables:

    + 0 1 x x + 1
    0 0 1 x x + 1
    1 1 0 x + 1 x
    x x x + 1 0 1
    x + 1 x + 1 x 1 0
     
    0 1 x x + 1
    0 0 0 0 0
    1 0 1 x x + 1
    x 0 x x2 mod m x2 + x mod m
    x + 1 0 x + 1 x2 + x mod m x2 + 1 mod m

    Hold on. What's that funny-looking m?

    It's a "reduction polynomial" which brings the product back down to degree 1 or less. It has to be a polynomial of degree 2. There are four such polynomials: let's try each and see what we get.

    x2
    0 x
    x 1
    x2 + 1
    1 x + 1
    x + 1 0
    x2 + x
    x 0
    0 x + 1
    x2 + x + 1
    x + 1 1
    1 x

    Note that the first three polynomials all factor into products of lower-degree polynomials: x2 = x(x), x2 + 1 = (x + 1)(x + 1), x2 + x = x(x + 1). Only x2 + x + 1 is prime; and this prime reduction polynomial generates a complete multiplication table with no 0s. This is a necessary condition to be a field. Our final tables are:

    + 0 1 x x + 1
    0 0 1 x x + 1
    1 1 0 x + 1 x
    x x x + 1 0 1
    x + 1 x + 1 x 1 0
     
    0 1 x x + 1
    0 0 0 0 0
    1 0 1 x x + 1
    x 0 x x + 1 1
    x + 1 0 x + 1 1 x

    We can also write our elements in binary form: 0 => 00, 1 => 01, x => 10, and x + 1 => 11. In this notation our tables become:

    + 00 01 10 11
    00 00 01 10 11
    01 01 00 11 10
    10 10 11 00 01
    11 11 10 01 00
     
    00 01 10 11
    00 00 00 00 00
    01 00 01 10 11
    10 00 10 11 01
    11 00 11 01 10

    Rijndael works in GF(28) and uses a reduction polynomial of x8 + x4 + x3 + x + 1. They say this is prime. I sure hope so.

  • Matthew van Eerde's web log

    Microsoft etiquette: calendar appointments when going out of office

    • 1 Comments

    A common convention within Microsoft when going out of office is to create two calendar appointments in Outlook:

    1. An appointment which allows people who are trying to schedule a meeting with you to know that you're out of office at the time
      1. Attendees: just you
      2. Show As: Out of Office
    2. An informational appointment which allows people who work closely with you to know when you're going to be out of office
      1. Attendees: people who work closely with you (e.g., your immediate team)
      2. Show As: Free
      3. Request Responses: Off
      4. Allow New Time Proposals: Off
      5. Reminder: None

    I'm always forgetting one of the steps under 2., so I'm creating this blog post. Next time I go out of office I'll remember to check this post.

  • Matthew van Eerde's web log

    Why is 1 Pascal equal to 94 dB Sound Pressure Level? (1 Pa = 94 dB SPL)

    • 0 Comments

    Last time we talked about why a full-scale digital sine wave has a power measurement of -3.01 dB FS (Spoiler: because it's not a square wave.)

    This time we'll discuss why an atmospheric sound which generates a root-mean-square pressure of 1 Pascal has a power measurement 94 dB SPL.

    As before, dB is defined as 10 log10(PA2 / PB2) where PB is a reference level.

    Before, we had a digital measurement with an obvious ceiling: sample values of -1 and 1. So the reference point 0 dB FS was defined in terms of the signal with the greatest possible energy.

    In the analog domain, there isn't an obvious ceiling. We instead consider the floor - the quietest possible signal that is still audible by human ears.

    This is a rather wishy-washy definition, but the convention is to take PB = 20 μPa = 0.00002 Pa exactly.

    So our 0 dB SPL reference point is when PA = PB: 0 dB SPL = 10 log10(0.000022 / 0.000022) = 10 log10(1) = 10 (0) = 0.

    What if the pressure level is 1 Pascal? This is a quite loud sound, somewhere between heavy traffic and a jackhammer.

    1 Pa in dB SPL =

        10 log10(12 / PB2) =

        20 log10(1 / PB) =

        -20 log10(PB) =

        -20 log10(2(10-5)) =

        -20 (log10 2 + log10 10-5) =

        -20 ((log10 2) - 5) =

        100 - 20 log10 2 ≈ 93.9794 dB SPL

    So 1 Pa is actually a tiny bit less than 94 dB SPL; it's closer to 93.98 = (100 - 6.02) dB SPL.

  • Matthew van Eerde's web log

    Arbitrary HTML and JavaScript injection

    • 0 Comments
    1. Inject arbitrary HTML into this page:
      Any HTML you want

    2. Run arbitrary JavaScript on this page:
      4

  • Matthew van Eerde's web log

    Getting peak meters and volume settings for all apps and audio devices on the system

    • 3 Comments

    A few previous posts have touched on how to get peak meter readings on the device, and per-app

    Let's put it all together and write a single command-line tool which enumerates:
    1. All active audio devices (both playback and recording)
    2. Dumps the peak meter and volume levels for each device
    3. All active audio applications (audio sessions) per device
    4. Dumps the peak meter and volume levels for each audio session
    Note there is no API for enumerating individual streams within a session.
    Pseudocode:
    For each flow in (render, capture)
        For each device in IMMDeviceEnumerator::EnumAudioEndpoints(flow)
            Display the name of the device
            Get and display IAudioMeterInformation::GetPeakValue for the device
            Get and display IAudioEndpointVolume data for the device
            For each session in IAudioSessionManager2::GetSessionEnumerator
                Skip the session unless the state is "active"
                Get and display IAudioMeterInformation::GetPeakValue for the session
                Display session information
                Get and display ISimpleAudioVolume information
                Get and display IChannelAudioVolume information

    Sample output:

    >meters.exe
    -- Playback devices --
    Line out (High Definition Audio Device)
        Peak: 0.404736
        Mute: 0
        Volume range: 0% to 100% (-46.5 dB to 0 dB in steps of 1.5 dB)
        Master: 74% (-4.32831 dB)
        Channel 1 of 2: 74% (-4.32831 dB)
        Channel 2 of 2: 74% (-4.32831 dB)

        Active session #1
            Peak value: 0.240089
            Icon path:
            Display name:
            Grouping parameter: {98710e41-6535-4cf0-b9b3-4181a0b7103e}
            Process ID: 3496 (single-process)
            Session identifier: {0.0.0.00000000}.{b3d96927-adc1-4d0f-a83d-bda63ad41843}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}
            Session instance identifier: {0.0.0.00000000}.{b3d96927-adc1-4d0f-a83d-bda63ad41843}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}|1%b3496
            System sounds session: no
            Package full name: Microsoft.ZuneMusic_2.2.41.0_x64__8wekyb3d8bbwe
            Master volume: 1 (0 dB FS)
            Not muted
            Channel #1 volume: 1 (0 dB FS)
            Channel #2 volume: 1 (0 dB FS)

        Active session #2
            Peak value: 0.329753
            Icon path:
            Display name:
            Grouping parameter: {fc078096-d2fc-4883-8b0d-af4619266c02}
            Process ID: 6720 (multi-process)
            Session identifier: {0.0.0.00000000}.{b3d96927-adc1-4d0f-a83d-bda63ad41843}|#%b{726CF207-6167-47C4-A745-55691DBD84A1}
            Session instance identifier: {0.0.0.00000000}.{b3d96927-adc1-4d0f-a83d-bda63ad41843}|#%b{726CF207-6167-47C4-A745-55691DBD84A1}|1%b#
            System sounds session: no
            HWND: 0x00000000000D1390 Windows Media Player
            Master volume: 1 (0 dB FS)
            Not muted
            Channel #1 volume: 1 (0 dB FS)
            Channel #2 volume: 1 (0 dB FS)

    Internal speakers (High Definition Audio Device)
        Peak: 0
        Mute: 1
        Volume range: 0% to 100% (-46.5 dB to 0 dB in steps of 1.5 dB)
        Master: 65.7804% (-6 dB)
        Channel 1 of 1: 65.7804% (-6 dB)

    -- Recording devices --
    Microphone (High Definition Audio Device)
        Peak: 0.000274658
        Mute: 0
        Volume range: 0% to 100% (-34.5 dB to 12 dB in steps of 1.5 dB)
        Master: 84.7652% (6 dB)
        Channel 1 of 2: 84.7652% (6 dB)
        Channel 2 of 2: 84.7652% (6 dB)

        Active session #1
            Peak value: 0.000274658
            Icon path:
            Display name:
            Grouping parameter: {cee77f5a-d651-4392-8ffc-232c6eecdf51}
            Process ID: 8212 (single-process)
            Session identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Program Files\WindowsApps\Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe\soundrec.exe%b{00000000-0000-0000-0000-000000000000}
            Session instance identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Program Files\WindowsApps\Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe\soundrec.exe%b{00000000-0000-0000-0000-000000000000}|1%b8212
            System sounds session: no
            Package full name: Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe
            Master volume: 0.847652 (-1.43565 dB FS)
            Not muted
            Channel #1 volume: 1 (0 dB FS)
            Channel #2 volume: 1 (0 dB FS)

        Active session #2
            Peak value: 0.000274658
            Icon path:
            Display name:
            Grouping parameter: {c346e9e3-a37e-427b-a2be-1feb2c81b469}
            Process ID: 2608 (single-process)
            Session identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Windows\System32\SoundRecorder.exe%b{00000000-0000-0000-0000-000000000000}
            Session instance identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Windows\System32\SoundRecorder.exe%b{00000000-0000-0000-0000-000000000000}|1%b2608
            System sounds session: no
            HWND: 0x00000000004611FA Sound Recorder
            Master volume: 0.847652 (-1.43565 dB FS)
            Not muted
            Channel #1 volume: 1 (0 dB FS)
            Channel #2 volume: 1 (0 dB FS)

     Source and binaries attached.

  • Matthew van Eerde's web log

    shellproperty.exe v2: read all properties on a file; set properties of certain non-VT_LPWSTR types

    • 1 Comments

    I updated my toy app to set/read shell properties from the command line. New features:

    1. Read all the properties from a given file in one go.
    2. Recognize properties by their canonical name (if they have one.)
    3. Set a property to VT_EMPTY (removing it), or "VT_VECTOR | VT_LPWSTR", or VT_UI4, in addition to VT_LPWSTR.

    Usage:

    >shellproperty.exe
    shellproperty read [ <key> | all ] from <filename>
    shellproperty set <key> on <filename> to <vartype> <vartype-specific-arguments>

    <vartype>: VT_EMPTY | VT_LPWSTR | "VT_VECTOR | VT_LPWSTR" | VT_UI4

    Example of reading all properties from a file:

    >shellproperty read all from "I 01 Track 1.mp3" | sort
    {9E5E05AC-1936-4A75-94F7-4704B8B01923} 0: VT_BSTR I 01 Track 1.mp3
    {CFA31B45-525D-4998-BB44-3F7D81542FA4} 1: VT_LPWSTR MP3
    System.AppUserModel.ID:
    System.AppUserModel.ParentID:
    System.Audio.ChannelCount: 2 (stereo)
    System.Audio.EncodingBitrate: 320kbps
    System.Audio.Format: {00000055-0000-0010-8000-00AA00389B71}
    System.Audio.IsVariableBitRate: No
    System.Audio.PeakValue: 23841
    System.Audio.SampleRate: 44 kHz
    System.Audio.SampleSize: 16 bit
    System.Audio.StreamNumber: 0
    System.Author: Unknown artist
    System.ComputerName: MATEER-D (this PC)
    System.ContentType: audio/mpeg
    System.DateAccessed: 9/3/2013 5:55 PM
    System.DateCreated: 9/3/2013 5:55 PM
    System.DateImported: 9/3/2013 5:55 PM
    System.DateModified: 9/24/2013 3:21 PM
    System.Document.DateCreated: 9/3/2013 5:55 PM
    System.Document.DateSaved: 9/24/2013 3:21 PM
    System.DRM.IsProtected: No
    System.ExpandoProperties:
    System.FileAttributes: A
    System.FileAttributesDisplay:
    System.FileExtension: .mp3
    System.FileName: I 01 Track 1.mp3
    System.FileOwner: REDMOND\mateer
    System.FilePlaceholderStatus: 7
    System.IsFolder: Files
    System.IsShared: No
    System.ItemAuthors: Unknown artist
    System.ItemDate: 9/3/2013 5:55 PM
    System.ItemFolderNameDisplay: Les Misérables (concept album)
    System.ItemFolderPathDisplay: C:\music\Claude-Michel Schönberg & Alain Boublil\Les Misérables (concept album)
    System.ItemFolderPathDisplayNarrow: Les Misérables (concept album) (C:\music\Claude-Michel Schönberg & Alain Boublil)
    System.ItemName: I 01 Track 1.mp3
    System.ItemNameDisplay: I 01 Track 1.mp3
    System.ItemNameDisplayWithoutExtension: I 01 Track 1
    System.ItemParticipants: Unknown artist
    System.ItemPathDisplay: C:\music\Claude-Michel Schönberg & Alain Boublil\Les Misérables (concept album)\I 01 Track 1.mp3
    System.ItemPathDisplayNarrow: I 01 Track 1 (C:\music\Claude-Michel Schönberg & Alain Boublil\Les Misérables (concept album))
    System.ItemType: MP3 File
    System.ItemTypeText: MP3 File
    System.Kind: Music
    System.KindText: Music
    System.Link.TargetExtension:
    System.Link.TargetParsingPath:
    System.Link.TargetSFGAOFlags:
    System.Link.TargetSFGAOFlagsStrings:
    System.Media.AverageLevel: 4219
    System.Media.ClassPrimaryID: {D1607DBC-E323-4BE2-86A1-48A42A28441E}
    System.Media.ClassSecondaryID: {00000000-0000-0000-0000-000000000000}
    System.Media.CollectionGroupID: {3B02CC9D-BE3E-43A4-81AA-DC23DFD20083}
    System.Media.CollectionID: {3B02CC9D-BE3E-43A4-81AA-DC23DFD20083}
    System.Media.ContentID: {3780156C-B516-4897-B6AC-CB632A0CA4A5}
    System.Media.DlnaProfileID: MP3
    System.Media.Duration: 00:04:47
    System.Media.MCDI: E+96+54E9+98AD+A23C+DBD5+F62C+11889+15B50+170F9+1C1EC+1E01E+221A7+2916C+2C6EB+2F21A
    System.Media.MetadataContentProvider: AMG
    System.Media.Publisher: Colosseum
    System.Media.UniqueFileIdentifier: AMGt_id=T 987037;AMGp_id=P 1857378;AMGa_id=R 189777;X_id={9D0F0F00-0500-11DB-89CA-0019B92A3933};XA_id={51E50200-0400-11DB-89CA-0019B92A3933};XAP_id={6357088C-778C-11DC-9403-0019B9B20868}
    System.Media.Year: 1989
    System.MIMEType: audio/mpeg
    System.Music.AlbumArtist: Various Artists
    System.Music.AlbumID: Various Artists - Les Miserables - French Concept Album: 1 of 2
    System.Music.AlbumTitle: Les Miserables - French Concept Album: 1 of 2
    System.Music.Artist: Unknown artist
    System.Music.Composer: Alain Boublil; Claude-Michel Schönberg
    System.Music.DisplayArtist: Various Artists
    System.Music.Genre: Unknown genre
    System.Music.PartOfSet: 1/1
    System.Music.TrackNumber: 1
    System.NetworkLocation:
    System.NotUserContent: No
    System.OfflineAvailability: Available offline
    System.OfflineStatus:
    System.ParsingName: I 01 Track 1.mp3
    System.ParsingPath: C:\music\Claude-Michel Schönberg & Alain Boublil\Les Misérables (concept album)\I 01 Track 1.mp3
    System.PerceivedType: Audio
    System.SFGAOFlags: 1077936503
    System.SharedWith:
    System.ShareScope: music\Claude-Michel Schönberg & Alain Boublil\Les Misérables (concept album)
    System.SharingStatus: Not shared
    System.Shell.SFGAOFlagsStrings: filesys; stream
    System.Size: 10.9 MB
    System.ThumbnailCacheId: 16520045390528741485
    System.Title: Track 1
    System.VolumeId: {14FF6E9D-14F5-11E3-824C-806E6F6E6963}
    System.ZoneIdentifier: 0

    Example of updating a file:

    >type _fixup.bat
    @echo off

    for /f "usebackq delims=" %%f in (`dir /s /b "I *.mp3"`) do (
        shellproperty set System.Music.AlbumTitle on "%%f" to VT_LPWSTR "Madama Butterfly - Sinopoli / Freni: 1 of 3"
    )

    for /f "usebackq delims=" %%f in (`dir /s /b "II *.mp3"`) do (
        shellproperty set System.Music.AlbumTitle on "%%f" to VT_LPWSTR "Madama Butterfly - Sinopoli / Freni: 2 of 3"
    )

    for /f "usebackq delims=" %%f in (`dir /s /b "III *.mp3"`) do (
        shellproperty set System.Music.AlbumTitle on "%%f" to VT_LPWSTR "Madama Butterfly - Sinopoli / Freni: 3 of 3"
    )

    Source and binaries (x86 and amd64) attached.

  • Matthew van Eerde's web log

    Sample app for RECT functions

    • 0 Comments

    Riffing on Raymond Chen's post today about SubtractRect I threw together a toy app which demonstrates three rectangle functions: UnionRect, IntersectRect, and SubtractRect.

    Usage:

    >rects.exe
    rects.exe
        union     (left1 top1 right1 bottom1) (left2 top2 right2 bottom2) |
        intersect (left1 top1 right1 bottom1) (left2 top2 right2 bottom2) |
        subtract  (left1 top1 right1 bottom1) (left2 top2 right2 bottom2)

    Sample output:

    >rects.exe union (2 2 6 6) (4 4 8 8)
          (left = 2; top = 2; right = 6; bottom = 6)
    union (left = 4; top = 4; right = 8; bottom = 8)
        = (left = 2; top = 2; right = 8; bottom = 8)

    Source and binaries (amd64 and x86) attached.

    Still no pictures though.

    Exercise: implement BOOL SymmetricDifferenceRect(_Out_ RECT *C, const RECT *A, const RECT *B).

  • Matthew van Eerde's web log

    shellproperty.exe - set/read string properties on a file from the command line

    • 2 Comments

    Yesterday Raymond Chen blogged a "Little Program" which could edit audio metadata. As it happens, I have a similar tool I threw together which accepts a property key and a string property value to update a property, or can read a string or string-vector property.

    Usage:

    >shellproperty
    shellproperty read <key> from <filename>
    shellproperty set <key> to <string> on <filename>

    Here's an example _fixup.bat script I use to set audio metadata on my copy of Giuseppe Sinopoli's recording of Madama Butterfly, to help distinguish it from other recordings of the same opera that I have.

    @echo off
    dir /s /b "I *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 1 of 3" on
    dir /s /b "II *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 2 of 3" on
    dir /s /b "III *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 3 of 3" on

    Source and amd64/x86 binaries attached, but in substance it's very similar to Raymond's "Little Program".

    Possible future improvements:

    1. When setting, allow specifying a vartype on the command line.
    2. Allow specifying a property key by fmtid and pid.
    3. Handle more vartypes for displaying properties.
    4. Allow dumping all properties on a given file.
  • Matthew van Eerde's web log

    Even if someone's signaling right, they still have the right of way

    • 0 Comments

    I was driving to work this morning and I had an experience which vindicated my paranoia, and may even have passed it on to someone else.

    I was heading East on NE 80th St approaching 140th Ave NE in Redmond. This is a two-way stop; drivers on 140th have the right of way and do not stop. Drivers on 80th (me) have to stop.

    I came to a full stop and signaled right (I wanted to head South on 140th). A driver (let's call him Sam) pulled up behind me, also signaling right. There were three cars heading South on 140th, all of them signaling right (they wanted to head West on 80th).

    At this point I had a conversation with myself that went something like this.

    Well, Matt, you could turn right now. All those cars are turning right, so they won't hit you.
    But wait, Matt. Those cars have the right of way. Sure they're signaling right. But that doesn't mean they'll actually turn right.
    Yup, you're right, Matt. Better to wait to see what actually happens.

    So I waited, and sure enough, all three cars actually turned right. So I suppose I could have gone. And more cars were feeding in to 140th from Redmond Way. And all of these cars were signaling right. And one was a school bus.

    At this point Sam (remember Sam?) got impatient and honked his horn. This shocked me a little.

    I imagine anyone who is from New York or Los Angeles is shaking their heads at me now. Not for waiting, but for being shocked. "He honked his horn? So what?"

    (This is a cultural difference. In New York or Los Angeles, if you're waiting at a red light, you will get honked at as soon as the other guy's light turns yellow. But in Washington, the guy behind you will calmly wait through two full greens, then politely knock on your window and ask if everything is OK.)

    I trust the school bus even less than the cars, so I let the school bus go.

    The car behind the school bus is a minivan. He's signaling right, too. But I let him go as well... and he goes straight!

    Behind the minivan, there's enough of a gap that I feel comfortable pulling out, so I do. And Sam pulls up to the line.

    As I'm cruising down 140th, I glance in my rear-view mirror. I see a line of cars coming down 140th to the intersection I just left, all signaling right...

    ... and I see my friend Sam...

    ... patiently waiting.

    May the Force be with you, Sam.

  • Matthew van Eerde's web log

    Getting the package full name of a Windows Store app, given the process ID

    • 0 Comments

    Last time I talked about enumerating audio sessions and showed an example which listed several Desktop apps and one Windows Store app.

    Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}|1%b11812

    It's possible to guess that this is a Windows Store app by the presence of the WWAHost.exe string in the session instance identifier. Don't rely on this, though; the session identifiers are opaque strings, and their formula can change at any time.

    We were able to get some additional information on the Desktop apps by enumerating their top-level windows and reading the window text. But how do we get more information on the Windows Store app? And how do we even know it's a Windows Store app without cracking the session identifier?

    By using the Application Model APIs - for example, GetPackageFullName.

    Pseudocode:

    ... get a process ID...

    OpenProcess(PROCESS_QUERY_LIMITED_USER_INFORMATION, FALSE, pid);

    GetPackageFullName(...)

    if APPMODEL_ERROR_NO_PACKAGE then the process has no associated package and is therefore not a Windows Store app.

    Updated sample output:

    -- Active session #4 --
    Icon path:
    Display name:
    Grouping parameter: {8dbd87b0-9fce-4c27-b7ff-4b20b0dae1a3}
    Process ID: 11644 (single-process)
    Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}
    Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}|1%b11644
    System sounds session: no
    Peak value: 0.395276
    Package full name: Microsoft.ZuneMusic_2.0.132.0_x64__8wekyb3d8bbwe

    Source and binaries attached.

  • Matthew van Eerde's web log

    More on IAudioSessionControl and IAudioSessionControl2, plus: how to log a GUID

    • 0 Comments

    A while back I blogged about using IAudioSessionControl and IAudioSessionControl2 to get a list of active sessions, and then using IAudioMeterInformation to see what the amplitude level of the audio being played from each session was.

    I decided to go back and push this a little further and see what information there was to dig out. Pseudocode:

    CoCreate(IMMDeviceEnumerator)
    MMDevice = IMMDeviceEnumerator::GetDefaultAudioEndpoint(...)
    AudioSessionManager2 = MMDevice::Activate(...)
    AudioSessionEnumerator = AudioSessionManager2::GetSessionEnumerator()

    for each session in AudioSessionEnumerator {
        AudioSessionControl = AudioSessionEnumerator::GetSession(...)
        if (AudioSessionStateActive != AudioSessionControl::GetState()) { continue; }

        AudioSessionControl::GetIconPath (usually blank)
        AudioSessionControl::GetDisplayName (usually blank)
        AudioSessionControl::GetGroupingParam

        AudioSessionControl2 = AudioSessionControl::QueryInterface(...)
        AudioSessionControl2::GetSessionIdentifier (treat this as an opaque string)
        AudioSessionControl2::GetSessionInstanceIdentifier (treat this as an opaque string)
        AudioSessionControl2::GetProcessId (some sessions span multiple processes)
        AudioSessionControl2::IsSystemSoundsSession

        AudioMeterInformation = AudioSessionControl::QueryInterface(...)
        AudioMeterInformation::GetPeakValue

        for each top level window in the process pointed to by AudioSessionControl2::GetProcessId {
            Use WM_GETTEXTLENGTH and WM_GETTEXT to get the window text, if any
        }
    }

    Here's the output of the new version of meters.exe.

    >meters.exe
    -- Active session #1 --
        Icon path:
        Display name:
        Grouping parameter: {a204c0ad-03c4-4754-8e8f-92843aacb1fa}
        Process ID: 11812 (single-process)
        Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}
        Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Windows\System32\WWAHost.exe%b{00000000-0000-0000-0000-000000000000}|1%b11812
        System sounds session: no
        Peak value: 0.2837

    -- Active session #2 --
        Icon path:
        Display name:
        Grouping parameter: {a2e2e0f5-81bb-407e-b701-f4f3695f9dac}
        Process ID: 15148 (single-process)
        Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Program Files (x86)\Internet Explorer\iexplore.exe%b{00000000-0000-0000-0000-000000000000}
        Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Program Files (x86)\Internet Explorer\iexplore.exe%b{00000000-0000-0000-0000-000000000000}|1%b15148
        System sounds session: no
        Peak value: 0.428589
        HWND: 0x0000000001330B12
        HWND: 0x0000000000361CA2
        HWND: 0x00000000019A07A8
        HWND: 0x0000000001411BF2
        HWND: 0x0000000000B60706
        HWND: 0x000000000231165A
        HWND: 0x0000000002631472
        HWND: 0x0000000000441D94

    -- Active session #3 --
        Icon path:
        Display name:
        Grouping parameter: {e191c91d-dc24-468d-b542-0d5f12ce8c48}
        Process ID: 2324 (multi-process)
        Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|#%b{48FF2ED2-2CE8-40FB-AEF7-31FEFDBA7EF2}
        Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|#%b{48FF2ED2-2CE8-40FB-AEF7-31FEFDBA7EF2}|1%b#
        System sounds session: no
        Peak value: 0.294137
        HWND: 0x0000000002900C86 Windows Media Player

    -- Active session #4 --
        Icon path: @%SystemRoot%\System32\AudioSrv.Dll,-203
        Display name: @%SystemRoot%\System32\AudioSrv.Dll,-202
        Grouping parameter: {e7d6e107-ca03-4660-a067-1a1f3dc1619c}
        Process ID: 0 (multi-process)
        Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|#%b{A9EF3FD9-4240-455E-A4D5-F2B3301887B2}
        Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|#%b{A9EF3FD9-4240-455E-A4D5-F2B3301887B2}|1%b#
        System sounds session: yes
        Peak value: 0.0502903

    -- Active session #5 --
        Icon path:
        Display name:
        Grouping parameter: {2a3e30fb-2ded-471e-9c2f-cbd8572b2af2}
        Process ID: 15948 (single-process)
        Session identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Program Files (x86)\VideoLAN\VLC\vlc.exe%b{00000000-0000-0000-0000-000000000000}
        Session instance identifier: {0.0.0.00000000}.{125eeed2-3cd2-48cf-aac9-8ae0157564ad}|\Device\HarddiskVolume1\Program Files (x86)\VideoLAN\VLC\vlc.exe%b{00000000-0000-0000-0000-000000000000}|1%b15948
        System sounds session: no
        Peak value: 0.287567
        HWND: 0x0000000000C8160C Opening Ceremony - VLC media player

    Active sessions: 5

    Part of this was logging the grouping parameter, which is a GUID. I've seen a lot of code that converts the GUID to a string and logs it using %s. Another way is to use a couple of macros and let the format string do the conversion for you:

    #define GUID_FORMAT L"{%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x}"
    #define GUID_VALUES(g) \
    g.Data1, g.Data2, g.Data3, \
    g.Data4[0], g.Data4[1], g.Data4[2], g.Data4[3], \
    g.Data4[4], g.Data4[5], g.Data4[6], g.Data4[7]

    ...

    GUID someGuid = ...;

    LOG(L"The value of someGuid is " GUID_FORMAT L".", GUID_VALUES(someGuid));

    Standard caveats about not using side effects inside a macro apply. For example, this would be a bug:

    for (GUID *p = ...) {
        LOG(L"p = " GUID_FORMAT L".", GUID_VALUES(*(p++)); // BUG!
    }

     Source, amd64 binaries, and x86 binaries attached.

     

  • Matthew van Eerde's web log

    Buffer size alignment and the audio period

    • 2 Comments

    I got an email from someone today, paraphrased below:

    Q: When I set the sampling frequency to 48 kHz, and ask Windows what the audio period is, I get exactly 10 milliseconds. When I set it to 44.1 kHz, I get very slightly over 10 milliseconds: 10.1587 milliseconds, to be precise. Why?
    A: Alignment.

    A while back I talked a bit about the WASAPI exclusive-mode alignment dance. Some audio drivers have a requirement that they deal in buffer sizes which are multiples of a certain byte size - for example, a common alignment restriction for HD Audio hardware is 128 bytes.

    A more general audio requirement is that buffer sizes be a multiple of the size of a PCM audio frame.
    For example, suppose the audio format of a stream is stereo 16-bit integer. A single PCM audio frame will be 2 * 2 = 4 bytes. The first two bytes will be the 16-bit signed integer with the sample value for the left channel; the last two bytes will be the right channel.
    As another example, suppose the audio format of a stream is 5.1 32-bit floating point. A single PCM audio frame will be 6 * 4 = 24 bytes. Each of the six channels are a four-byte IEEE floating-point value; the channel order in Windows will be {Left, Right, Center, Low-Frequency Effects, Side Left, Side Right}.

    The audio engine tries to run at as close to a 10 millisecond cadence as possible, subject to the two restrictions above. Given a "desired minimum interval" (in milliseconds), and a streaming format, and an "alignment requirement" in bytes, you can calculate the closest achievable interval (without going under the desired interval) as follows:

    Note: this only works for uncompressed formats
    aligned_buffer(desired_milliseconds, format, alignment_bytes)
        desired_frames = nearest_integer(desired_milliseconds / 1000.0 * format.nSamplesPerSec)
        alignment_frames = least_common_multiple(alignment_bytes, format.nBlockAlign) / format.nBlockAlign
        actual_frames = ceiling(desired_frames / alignment_frames) * alignment_frames
        actual_milliseconds = actual_frames / format.nSamplesPerSec * 1000.0

    Here's a table of the actual buffer size (in frames and milliseconds), given a few typical inputs:

    Desired (milliseconds) Format Alignment (bytes) Desired frames Alignment (frames) Actual (frames) Actual (milliseconds)
    10 44.1 kHz stereo 16-bit integer 128 441 32 448 10.16
    10 48 kHz stereo 16-bit integer 128 480 32 480 10
    10 44.1 kHz 5.1 16-bit integer 128 441 32 448 10.16
    10 48 kHz 5.1 16-bit integer 128 480 32 480 10
    10 44.1 kHz 5.1 24-bit integer 128 441 64 448 10.16
    10 48 kHz 5.1 24-bit integer 128 480 64 512 10.66

    So to be really precise, the buffer size is actually 640/63 = 10.158730 milliseconds.

  • Matthew van Eerde's web log

    An attempt to explain the twin prime conjecture to a five-year-old

    • 0 Comments

    Back in April, Zhang Yitang came up with a result that is a major step toward proving the twin prime conjecture that there are infinitely many primes p for which p + 2 is also prime.

    In a reddit.com/r/math thread on the subject, I made the following comment as an attempt to explain the twin prime conjecture to a five-year-old:

    ELI5 attempt at the twin prime conjecture

    Think of cookie parties.

    If you have 100 cookies, you could have a cookie party:

    • by yourself (you get all the cookies)
    • with two people (each person gets 50 cookies)
    • with four people (each person gets 25 cookies)
    • with five people (each person gets 20 cookies)
    • with ten people (each person gets ten cookies)
    • with 20 people (each person gets five cookies)
    • with 25 people (each person gets four cookies)
    • with 50 people (each person gets two cookies)
    • with 100 people (each person gets one cookie)

    If you're the only person at your party, it's a sad party.

    If everyone at the party gets only one cookie, it's a sad party.

    If someone gets more than someone else, it's a sad party.

    You don't want your party to be sad, so you have to be careful to have the right number of people to share your cookies.

    If you have two cookies, or three, or five, or seven, or eleven, then it's not possible to have a happy party. There's no "right number of people."

    People used to wonder whether you could be sure to have a happy party if you just had enough cookies. A famous person named Euclid figured out that, no matter how many cookies you had, even if it was, like, more than a million, you might be unlucky and have a sad number of cookies.

    If it's a birthday party, the birthday kid's mom might give the birthday kid an extra cookie. (Or they might get something else instead.) That would be OK.

    If it's a birthday party, then, yes, you can be sure to have a happy party if you just had enough cookies. In fact, even three cookies would be enough; you could have the birthday kid, and one friend; they would each have one cookie, and the birthday kid would get the extra one.

    But Sam and Jane have a problem. They're twins, and they always have the same birthday. One year they had 13 cookies, and it was a big problem. 13 is a sad number. Even if they both had an extra cookie, that would leave 11, and that is still a sad number.

    (If you allow the birthday kid to have two extra cookies, that would leave nine; they could invite one more person, give everyone three cookies, and then Sam and Jane could each have two extras. But this is not a happy party because the guests will get upset that the birthday kids got two extra cookies. I mean, come on!)

    Sam and Jane wondered whether they could be sure to have a happy party if they just had enough cookies.

    So they asked their mom, who is, like, super smart.

    But even she didn't know.

    In fact, no-one knows. They don't think so. But they're not, like, super-sure.

  • Matthew van Eerde's web log

    Grabbing the output of the Microsoft Speech API text-to-speech engine as audio data

    • 3 Comments

    A while ago I wrote a post on Implementing a "say" command using ISpVoice from the Microsoft Speech API which showed how to use Speech API to do text-to-speech, but was limited to playing the generated audio out of the default audio device.

    Recently on the Windows Pro Audio forums, user falven asked a question about how to grab the output of the text-to-speech engine as a stream for further processing.

    Here's how to do it.

    The key part is to use ISpStream::BindToFile to save the audio data to a .wav file, and ISpStream::SetBaseStream to save to a given IStream. Then call ISpVoice::SetOutput with the ISpStream, prior to calling ISpVoice::Speak.

                ISpStream *pSpStream = nullptr;
                hr = CoCreateInstance(
                    CLSID_SpStream, nullptr, CLSCTX_ALL,
                    __uuidof(ISpStream),
                    (void**)&pSpStream
                );
                if (FAILED(hr)) {
                    ERR(L"CoCreateInstance(ISpVoice) failed: hr = 0x%08x", hr);
                    return -__LINE__;
                }
                ReleaseOnExit rSpStream(pSpStream);
               
                if (File == where) {
                    hr = pSpStream->BindToFile(
                        file,
                        SPFM_CREATE_ALWAYS,
                        &SPDFID_WaveFormatEx,
                        &fmt,
                        0
                    );
                    if (FAILED(hr)) {
                        ERR(L"ISpStream::BindToFile failed: hr = 0x%08x", hr);
                        return -__LINE__;
                    }
                } else {
                    // stream
                    pStream = SHCreateMemStream(NULL, 0);
                    if (nullptr == pStream) {
                        ERR(L"SHCreateMemStream failed");
                        return -__LINE__;
                    }
                   
                    hr = pSpStream->SetBaseStream(
                        pStream,
                        SPDFID_WaveFormatEx,
                        &fmt
                    );
                    if (FAILED(hr)) {
                        ERR(L"ISpStream::SetBaseStream failed: hr = 0x%08x", hr);
                        return -__LINE__;
                    }
                }
               
                hr = pSpVoice->SetOutput(pSpStream, TRUE);
                if (FAILED(hr)) {
                    ERR(L"ISpVoice::SetOutput failed: hr = 0x%08x", hr);
                    return -__LINE__;
                }

    Updated source and binaries attached.

    Usage:

    >say.exe
    say "phrase" [--file <filename> | --stream]
    runs phrase through text-to-speech engine
    if --file is specified, writes to .wav file
    if --stream is specified, captures to a stream
    if neither is specified, plays to default output

    Here's how to generate a .wav file (uh.wav attached)

    >say.exe "uh" --file uh.wav
    Stream is 1

    And here's how to generate an output stream. The app consumes this and prints the INT16 sample values to the console. uh.txt attached.

    >say.exe "uh" --stream
    Stream is 1
           0        0;        0        0;        0        0;        0        0
           0        0;        0        0;        0        0;        0        0
    ...
          86       86;    -1052    -1052;    -2839    -2839;    -3774    -3774
       -4199    -4199;    -4581    -4581;    -4284    -4284;    -3640    -3640
       -3100    -3100;    -2011    -2011;     -393     -393;      533      533
    ...

Page 1 of 6 (141 items) 12345»