So HMRC replied to me promptly and helpfully, as usual.
In their reply, they included the following important information:
- The IRmark is calculated from the payload of the submission, from (and including) the opening <Body> tag all the way to and including the closing </Body> tag. (From experimentation, I can confirm that this does not include any whitespace before or after the tags.)
- They pointed me to this collection of information (see below).
- They gave me the IRmark that their system was calculating from my submission, which makes matching so much easier.
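To make the first point concrete, here is a minimal sketch of pulling the payload out of a message held as a string. The name extractPayload and the sample message are made up for this illustration, and it assumes there is exactly one Body element with no namespace prefix on the tag.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractPayload returns the text from the opening <Body tag through the
// closing </Body> tag, with nothing before or after it.
// Hypothetical helper: assumes a single, unprefixed Body element.
func extractPayload(msg string) (string, error) {
	const closeTag = "</Body>"
	start := strings.Index(msg, "<Body")
	end := strings.LastIndex(msg, closeTag)
	if start == -1 || end == -1 || end < start {
		return "", errors.New("no complete Body element found")
	}
	return msg[start : end+len(closeTag)], nil
}

func main() {
	// Made-up message, just enough structure to show the slicing.
	msg := "<GovTalkMessage>\n  <Body>\n    <Sender>Company</Sender>\n  </Body>\n</GovTalkMessage>"
	payload, err := extractPayload(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// %q makes the preserved inner whitespace visible.
	fmt.Printf("%q\n", payload)
}
```

Note that the indentation before the opening tag and the newline after the closing tag are dropped, while everything between the tags is kept byte for byte.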
Section 1.1 answers questions I had and addresses the main problem I had. In fact, if I had had this document last time, I think I would have made it to the finish line.
The IRmark is generated from the payload of the submission so this part of the XML must be extracted first. The payload is everything inside and including the <Body></Body> node. When you extract the body you must “inherit” any and all namespace declarations in the <GovTalkMessage> node and place them in the <Body> node.
I would never have thought to do this, but it is certainly something that I can do in generation. The other important thing, although phrased differently to the way I would say it, is also in section 1.1:
Finally, to prepare the XML you need to remove the IRmark node from the <Body>. However you choose to do this any data around the IRmark opening and closing tags e.g. white space, line-endings, tabs etc must be preserved.
In other words, when inserting the IRmark node, it's important not to also insert any additional whitespace.
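The two preparation steps quoted from section 1.1 can be sketched as plain string operations. This is only an illustration under my own assumptions: the function names stripIRmark and inheritNamespace are invented, the namespace value is a placeholder, and a real message may carry more than one namespace declaration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// stripIRmark removes only the bytes from "<IRmark" through "</IRmark>".
// Whitespace on either side of the element is left exactly as it was.
func stripIRmark(body string) (string, error) {
	const closeTag = "</IRmark>"
	start := strings.Index(body, "<IRmark")
	if start == -1 {
		return "", errors.New("no IRmark element found")
	}
	end := strings.Index(body[start:], closeTag)
	if end == -1 {
		return "", errors.New("IRmark element is not closed")
	}
	return body[:start] + body[start+end+len(closeTag):], nil
}

// inheritNamespace copies a default namespace onto a bare <Body> tag.
// Hypothetical helper: handles a single default namespace only.
func inheritNamespace(body, ns string) (string, error) {
	const openTag = "<Body>"
	i := strings.Index(body, openTag)
	if i == -1 {
		return "", errors.New("no bare <Body> tag found")
	}
	return body[:i] + `<Body xmlns="` + ns + `">` + body[i+len(openTag):], nil
}

func main() {
	// Made-up body; "urn:example" stands in for the envelope namespace.
	body := "<Body>\n  <IRmark Type=\"generic\">abc</IRmark>\n  <Sender>x</Sender>\n</Body>"
	stripped, err := stripIRmark(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	prepared, err := inheritNamespace(stripped, "urn:example")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", prepared)
}
```

The line that held the IRmark survives as a line containing only its indent, which is exactly the whitespace that has to be preserved.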
Even more helpfully, this collection of information also includes a worked sample: the submitted message (including the GovTalk header and the calculated IRmark) and the canonicalised Body. The canonicalised Body clearly shows the changes above, and it can be used to check the algorithm in full knowledge that the expected result is correct.
This enabled me to "tinker" with my submission until I was able to generate the same IRmark that they did, which, after all, is the objective here.
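The "tinkering" loop boils down to hashing a candidate canonical body and comparing the result with the value HMRC supplied. Here is a minimal sketch of that check, assuming the IRmark is the Base64 encoding of the SHA-1 digest of the canonicalised bytes (which is what the implementation later in this post does). The name irmarkOf is my own, and the input is the standard "abc" SHA-1 test vector rather than a real submission.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// irmarkOf returns the Base64 encoding of the SHA-1 digest of the given
// bytes. Canonicalisation is assumed to have happened already.
func irmarkOf(canonical []byte) string {
	sum := sha1.Sum(canonical)
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Placeholder input: in practice this would be the canonicalised Body
	// and the expected value would be the IRmark from the worked sample.
	canonical := []byte("abc")
	expected := "qZk+NkcGgWq6PiVxeFDCbJzQ2J0="

	got := irmarkOf(canonical)
	if got == expected {
		fmt.Println("IRmark matches")
	} else {
		fmt.Printf("IRmark mismatch: got %s, want %s\n", got, expected)
	}
}
```

Because a single stray space changes the digest completely, a mismatch says nothing about how close you are, which is why having a known-good canonical body to diff against is so valuable.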
So, now let's go and do the same thing with our generated code.
The Correct IRmark Implementation
So, we need to generate the <Body> first, and then we need to process that, generate the IRmark, manipulate the body (as text; the IRmark process is incredibly sensitive) and then wrap all of that in the GovTalk message. I'm going to present the code as it ended up, rather than any intermediate steps.

func (gtm *GovTalkMessage) AsXML() (*GovTalkMessageXML, error) {
	...
	var body *SimpleElement
	var canonBody string
	if gtm.opts.IncludeBody {
		body = gtm.makeBody()
		var err error
		canonBody, err = canonicaliseBody(body)
		if err != nil {
			return nil, err
		}
	}
	gt := MakeGovTalkMessage(
		canonBody,
		env,
		ElementWithNesting("Header", msgDetails, sndrDetails),
		gtDetails)
	return gt, nil
}
CT600_CORRECT_IRMARK:accounts/internal/ct600/govtalk/govtalk.go
The key difference here is that we have put all the work of dealing with the IRmark, and then generating a text body, into the canonicaliseBody function. We have then put that body in the GovTalkMessage structure. It is used during the final generation of the XML text:

func Generate(conf *config.Config, options *govtalk.EnvelopeOptions) (io.Reader, error) {
	msg := govtalk.MakeGovTalk(options)
	msg.Identity(conf.Sender, conf.Password)
	msg.Utr(conf.Utr)
	msg.Product(conf.Vendor, conf.Product, conf.Version)
	m, err := msg.AsXML()
	if err != nil {
		return nil, err
	}
	bs, err := xml.MarshalIndent(m, "", " ")
	if err != nil {
		return nil, err
	}
	bs, err = m.AttachBodyTo(bs)
	if err != nil {
		return nil, err
	}
	bs = []byte(string(bs) + "\n")
	err = checkAgainstSchema(bs)
	if err != nil {
		return nil, err
	}
	return bytes.NewReader(bs), nil
}
CT600_CORRECT_IRMARK:accounts/internal/ct600/submission/generate.go
And the process of attaching the body is to just use a placeBefore function that we'll look at in a bit more detail later on.

func (gtx *GovTalkMessageXML) AttachBodyTo(bs []byte) ([]byte, error) {
	return placeBefore(bs, "</GovTalkMessage>", gtx.canonBody)
}
CT600_CORRECT_IRMARK:accounts/internal/ct600/govtalk/xml.go
Now let's turn and look at canonicaliseBody.

func canonicaliseBody(from *SimpleElement) (string, error) {
	body := MakeBodyWithSchemaMessage(from.Elements...)
	// Generate a text representation
	bs, err := xml.MarshalIndent(body, " ", " ")
	if err != nil {
		return "", err
	}
	bs, err = placeBefore(bs, "<Sender>", "\n ")
	if err != nil {
		return "", err
	}
	// now canonicalise that
	decoder := xml.NewDecoder(bytes.NewReader(bs))
	out, err := c14n.Canonicalize(decoder)
	if err != nil {
		return "", err
	}
	// Generate a SHA-1 encoding
	hasher := sha1.New()
	_, err = hasher.Write([]byte(out))
	if err != nil {
		return "", err
	}
	sha := hasher.Sum(nil)
	// And then turn that into Base64; the resulting string is the IRmark
	b64sha := base64.StdEncoding.EncodeToString(sha)
	// remove the "fake" schema
	bs, err = deleteBetween(out, "<Body", ">")
	if err != nil {
		return "", err
	}
	// Add the IRmark
	bs, err = placeBefore(bs, "\n <Sender>", `<IRmark Type="generic">`+b64sha+"</IRmark>")
	if err != nil {
		return "", err
	}
	// Fix up whitespace around Body
	ret := " " + string(bs) + "\n"
	return ret, nil
}
CT600_CORRECT_IRMARK:accounts/internal/ct600/govtalk/govtalk.go
The main steps here are as indicated in the comments. The other points of note are as follows. We copy the contents of the body into a special Body element that has an associated schema. We then marshal this body with an additional two-space indent, since we will be placing it inside the GovTalk element. We add the extra line (together with indent) for the <IRmark>. Then, after we have done all the calculation (and come up with the IRmark value in b64sha), we delete the schema from the body tag and insert the <IRmark> element in the right place.
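And then, finally, here are the placeBefore and deleteBetween functions, which manipulate the XML text buffer directly:

// placeBefore inserts insert immediately before the first occurrence of match.
func placeBefore(bs []byte, match string, insert string) ([]byte, error) {
	str := string(bs)
	s1 := strings.Index(str, match)
	if s1 == -1 {
		return nil, fmt.Errorf("did not find %s", match)
	}
	str = str[0:s1] + insert + str[s1:]
	bs = []byte(str)
	return bs, nil
}

// deleteBetween removes the text after the first occurrence of from,
// up to (but not including) the next occurrence of to.
func deleteBetween(bs []byte, from string, to string) ([]byte, error) {
	canonBody := string(bs)
	j := strings.Index(canonBody, from)
	if j == -1 {
		return nil, fmt.Errorf("did not find %s", from)
	}
	j += len(from)
	j1 := strings.Index(canonBody[j:], to)
	if j1 == -1 {
		// without this check a missing terminator would silently corrupt the text
		return nil, fmt.Errorf("did not find %s after %s", to, from)
	}
	canonBody = canonBody[0:j] + canonBody[j+j1:]
	return []byte(canonBody), nil
}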
CT600_CORRECT_IRMARK:accounts/internal/ct600/govtalk/govtalk.go
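To see what the two text operations do to a body, here is a small self-contained sketch that applies them to a made-up canonical body. The helpers insertBefore and cutBetween are condensed, string-based stand-ins that I wrote for this illustration, not the code above, and the namespace, indentation, and IRmark value are all placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// insertBefore is a condensed stand-in for placeBefore: it inserts text
// immediately before the first occurrence of match.
func insertBefore(s, match, insert string) (string, error) {
	i := strings.Index(s, match)
	if i == -1 {
		return "", fmt.Errorf("did not find %q", match)
	}
	return s[:i] + insert + s[i:], nil
}

// cutBetween is a condensed stand-in for deleteBetween: it removes the text
// after the first `from`, up to (but not including) the next `to`.
func cutBetween(s, from, to string) (string, error) {
	i := strings.Index(s, from)
	if i == -1 {
		return "", fmt.Errorf("did not find %q", from)
	}
	i += len(from)
	j := strings.Index(s[i:], to)
	if j == -1 {
		return "", fmt.Errorf("did not find %q after %q", to, from)
	}
	return s[:i] + s[i+j:], nil
}

func main() {
	// Made-up canonical body with an empty, indented line reserved for the IRmark.
	canon := "<Body xmlns=\"urn:example\">\n  \n  <Sender>Company</Sender>\n</Body>"

	// Step 1: drop the namespace declaration from the opening tag.
	body, err := cutBetween(canon, "<Body", ">")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Step 2: put the IRmark on the reserved line without adding whitespace.
	body, err = insertBefore(body, "\n  <Sender>", `<IRmark Type="generic">MARK</IRmark>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Matching on the newline and indent together with the Sender tag is what lands the IRmark at the end of the reserved line, so removing the element again leaves exactly the text that was hashed.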
And, with a deep sigh of relief, I can say "aha! that submits!" and get a <SuccessResponse> back from the government.