TILL programs have a grid of squares. It would be good if, for each member of that grid, we could list out the label and all the styles that have been applied.
So, again, for cheapness' sake, I'm going to use a table. Column 1 is the row number; Column 2 shows the column number; Column 3 has the label and Column 4 has the list of styles.
It's easy enough to put that table in place in the HTML, so I've gone ahead and done that. Now, when we hit a breakpoint, we want to scrape the HTML and render that in the table in the side panel.
I don't think any of this is that complicated, but there are a couple of things I haven't done before.
Gathering the Styles and Text on Break
The biggest issue as far as I can see is deciding on a strategy. My immediate thought was to use a content script to scrape the HTML, with one message from the debugger to the content script to trigger the action, and another from the content script to the side panel to display the result. But then it occurred to me that we could also use "the DOM domain" from the service worker thread and send the results directly to the side panel.
I considered this from different angles and I think I like the content script angle better, but let's give the DOM domain approach a go anyway.
Time passes ...
If you want to look at where I got to, you can check out the tag CDP_BROKEN_FIND_DOM. I made quite a bit of progress, but I didn't like the way the code was shaping up, with lots of asynchronicity. The killer problem, though, would seem to be that the "DOM domain" does not have a function to recover the text of a node. The method requestChildNodes does not seem to return text nodes, or maybe I just wasn't using it correctly. Other than that, I could have made it work; I just stopped.
In particular, I found myself in one of those "variable capture" scenarios that always make you break your code up more than you really want to, and I didn't want to either fix it or present "broken" and "hard to understand" code here, so I just gave up.
I turned to the content script approach and managed to get something working quite quickly that I actually liked. I'm not going to present that either, because when I say I managed to get something working quite quickly, what I didn't appreciate for a long while was that the content script will only look at the incoming message once the code continues from the breakpoint. Which, for me, is completely useless.
I tried to re-orient this by defining a function and calling it with the debugger domain's evaluateOnCallFrame method, but that claimed it could not see the function I had defined in my content script. That seemed plausible, but it also seemed unfair: the content script was treated as "part of" the main window when that was unhelpful to me, and then not treated that way when that was also unhelpful to me.
Again, I feel there is something I should be able to do, but could not figure it out.
What to do in this situation? Walk away and Google things.
I had probably seen the getOuterHTML method in the DOM domain before, but I don't want the outer html, with all its markup, but the inner text, with no markup.
But beggars can't be choosers.
So, I pulled my dead first attempt back from the brink, added code to get the outer html, fixed all the asynchronicity issues using a Tracker object like I used when recovering the state, and then dragged in the display code I had added for my content script approach when I thought it was working and ... more or less, everything worked.
What I Finally Did
So I'm now going to present the code that worked as if it was what I originally thought of, which it absolutely was not. As I say, if you want to look at the false trails, they are there, but I don't want you (or AI) blindly copying those as if they work.
One of the bigger problems in using asynchronous APIs like the debugger and DOM domains is that you can't be quite sure what order things will come back in. You can, of course, use async and await to linearize things, but you need to understand that this really will linearize things: each request waits for the previous one to finish, and you may pay a significant performance hit.
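To make the cost of linearizing concrete, here is a minimal sketch that runs without a browser. fakeCommand is a hypothetical stand-in for chrome.debugger.sendCommand (it just resolves after a delay), and the function names are mine, not anything from the plugin.

```javascript
// fakeCommand is a hypothetical stand-in for chrome.debugger.sendCommand:
// it resolves with its label after delayMs milliseconds.
function fakeCommand(label, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve(label), delayMs));
}

// Linearized: each request is only sent once the previous one has answered.
async function oneAtATime(labels, delayMs) {
  const results = [];
  for (const label of labels) {
    results.push(await fakeCommand(label, delayMs));
  }
  return results;
}

// Concurrent: all requests are sent first, then we wait for all the answers.
function allAtOnce(labels, delayMs) {
  return Promise.all(labels.map(label => fakeCommand(label, delayMs)));
}

// Time both strategies over the same four pretend requests.
async function compare() {
  const labels = ["a", "b", "c", "d"];
  let start = Date.now();
  const sequential = await oneAtATime(labels, 20);
  const sequentialMs = Date.now() - start;
  start = Date.now();
  const concurrent = await allAtOnce(labels, 20);
  const concurrentMs = Date.now() - start;
  return { sequential, concurrent, sequentialMs, concurrentMs };
}
```

Both strategies produce the same results in the same order; the difference is that the linearized version pays for the delays one after another, while the concurrent one overlaps them. With a real debugger connection the numbers will differ, but the shape of the trade-off should be similar.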
So, I built a Collector object like the Tracker object I built last time to collect all the DOM entries. It works in the same way: before you make a request, tell me what you are expecting back, and I will prepare the ground for it. When it all comes back, we can send the final response.
class RowCollector {
  constructor() {
    this.rows = [];
    this.need = 1;
    this.have = 0;
  }
  ready(nrows) {
    this.need += nrows;
    for (var i=0;i<nrows;i++) {
      this.rows[i] = { rowNum: i, rowInfo: [] };
    }
    this.sendWhenComplete();
  }
  rowHas(row, ncols) {
    this.need += ncols * 2;
    var cs = this.rows[row].rowInfo;
    for (var i=0;i<ncols;i++) {
      cs[i] = { colNum: i };
    }
    this.sendWhenComplete();
  }
  attrs(row, col, attrs) {
    this.rows[row].rowInfo[col].styles = attrs;
    this.sendWhenComplete();
  }
  html(row, col, html) {
    this.rows[row].rowInfo[col].outer = html;
    this.sendWhenComplete();
  }
  sendWhenComplete() {
    this.have++;
    console.log("need", this.need, "have", this.have);
    if (this.need == this.have) {
      console.log("sending collected dom", this.rows);
      chrome.runtime.sendMessage({ action: "present-dom", info: this.rows });
    }
  }
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
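The need/have counting can be exercised on its own, without Chrome. The sketch below is not the plugin code: CountingCollector is a hypothetical, stripped-down cousin of RowCollector that keeps the same bookkeeping but calls an onDone callback instead of chrome.runtime.sendMessage, and the calls are made by hand, deliberately out of order.

```javascript
// CountingCollector is a hypothetical, stripped-down cousin of RowCollector:
// same need/have bookkeeping, but it calls onDone instead of sending a message.
class CountingCollector {
  constructor(onDone) {
    this.need = 1;          // the initial 1 is paid off by ready()
    this.have = 0;
    this.cells = [];
    this.onDone = onDone;
  }
  ready(nrows) {
    this.need += nrows;     // one rowHas() is now owed per row
    this.tick();
  }
  rowHas(row, ncols) {
    this.need += ncols * 2; // each cell owes one attrs() and one html()
    this.tick();
  }
  attrs(row, col, value) {
    this.cells.push([row, col, "attrs", value]);
    this.tick();
  }
  html(row, col, value) {
    this.cells.push([row, col, "html", value]);
    this.tick();
  }
  tick() {
    this.have++;
    if (this.need === this.have) this.onDone(this.cells);
  }
}

// Drive it by hand for a grid of 2 rows with 1 cell each.
let fired = 0;
const c = new CountingCollector(() => fired++);
c.ready(2);                    // need 3, have 1
c.rowHas(1, 1);                // need 5, have 2
c.html(1, 0, "<div>B</div>");  // have 3
c.rowHas(0, 1);                // need 7, have 4
c.attrs(0, 0, "cell");         // have 5
c.attrs(1, 0, "cell");         // have 6
const firedBeforeLast = fired; // still 0: nothing has been sent yet
c.html(0, 0, "<div>A</div>");  // have 7, which matches need, so onDone runs

// An empty grid completes straight away inside ready().
let emptyFired = 0;
new CountingCollector(() => emptyFired++).ready(0);
```

The point of the exercise is that the callback runs exactly once, and only after the last outstanding answer arrives, whatever order the answers come back in.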
Hopefully what this does will become clear as I explain each of the methods and how it is called. As before, there is an intentional "asymmetry" (or symmetry, perhaps) between initializing need as 1 and calling sendWhenComplete at the end of ready(), which increments have before testing whether we have all the entries we need. As before, it's important that every method here increments need before calling sendWhenComplete, so that it can never fire too soon. sendWhenComplete is called at the end of each of the setup methods both to increment have and to cover the case where a given array is empty: in that case its "inner" methods will never be called, and it is possible that we are complete already.
  if (stepMode || breakpointLines[lineNo]) {
    breakpointSource = source;
    chrome.runtime.sendMessage({ action: "hitBreakpoint", line: lineNo });
    chrome.debugger.sendCommand(source, "Debugger.evaluateOnCallFrame", { callFrameId: params.callFrames[0].callFrameId, expression: "state" }).then(state => {
      copyObject(source, {}, state.result, copy => {
        chrome.runtime.sendMessage({ action: "showState", state: copy });
      })
    });
    var ret = new RowCollector();
    chrome.debugger.sendCommand(source, "DOM.getDocument", {}).then(doc => {
      findRows(ret, doc.root.nodeId)
    });
  } else {
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
When we hit a breakpoint, in addition to notifying the side panel and collecting the state, we now want to get the document using the DOM.getDocument method in the DOM domain.
This doesn't actually obtain the whole document of course; it just returns an opaque handle to a document object. Within this is root, which is the "document element". The main feature of each of the Node objects returned by this API is the nodeId, an integer identifying the node on the far side. We can then interact with those nodes using other methods in the DOM domain. To avoid issues with variable capture, all of these interactions live in a set of very small functions.
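For anyone who hasn't been bitten by variable capture, here is a small sketch of the problem and two ways around it. None of this is plugin code: later is a hypothetical stand-in for an asynchronous debugger call, and the other names are purely illustrative.

```javascript
// Hypothetical stand-in for an asynchronous debugger call.
function later(value) {
  return new Promise(resolve => setTimeout(() => resolve(value), 0));
}

// Broken: "var" gives one rowNum for the whole function, so by the time any
// callback runs the loop has finished and every callback sees the final value.
function withVar(ids) {
  var pending = [];
  for (var rowNum = 0; rowNum < ids.length; rowNum++) {
    pending.push(later(ids[rowNum]).then(id => rowNum + ":" + id));
  }
  return Promise.all(pending);
}

// Fix 1: a small helper function whose parameter pins down the value
// for each individual call.
function collectOne(rowNum, id) {
  return later(id).then(got => rowNum + ":" + got);
}
function withHelper(ids) {
  var pending = [];
  for (var rowNum = 0; rowNum < ids.length; rowNum++) {
    pending.push(collectOne(rowNum, ids[rowNum]));
  }
  return Promise.all(pending);
}

// Fix 2: "let" creates a fresh binding of rowNum for each iteration.
function withLet(ids) {
  const pending = [];
  for (let rowNum = 0; rowNum < ids.length; rowNum++) {
    pending.push(later(ids[rowNum]).then(id => rowNum + ":" + id));
  }
  return Promise.all(pending);
}
```

For the input [10, 20, 30], withVar resolves to ["3:10", "3:20", "3:30"], while withHelper and withLet both resolve to ["0:10", "1:20", "2:30"]. The small-function style is the one used in this post; switching the loops to let would be another option.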
findRows scans the document (using querySelectorAll), looking for the .row elements, and calls the collector with the number of rows returned. As you can see above, that then initializes the return array with the relevant number of rows, assigning each one its number.
function findRows(collector, nodeId) {
  chrome.debugger.sendCommand(breakpointSource, "DOM.querySelectorAll", { nodeId: nodeId, selector: ".row" }).then(rows => {
    collector.ready(rows.nodeIds.length);
    findColumns(collector, rows.nodeIds);
  });
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
Here, we delegate finding the columns to a nested method:
function findColumns(collector, rows) {
  for (var rowNum=0;rowNum<rows.length;rowNum++) {
    collectRow(collector, rowNum, rows[rowNum]);
  }
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
Which then uses collectRow to capture the rowNum correctly and dispatch the async method:
function collectRow(collector, rowNum, row) {
  chrome.debugger.sendCommand(breakpointSource, "DOM.querySelectorAll", { nodeId: row, selector: ".cell" }).then(cols => {
    collector.rowHas(rowNum, cols.nodeIds.length);
    findCells(collector, rowNum, cols.nodeIds);
  });
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
Here the collector is told about the contents of the row, and it initializes the rowInfo structure while adding two to need for each cell, since we will be obtaining both the styles and the text asynchronously.
And then the pattern repeats, collecting each cell within the row:
function findCells(collector, rowNum, cols) {
  for (var colNum=0;colNum<cols.length;colNum++) {
    collectCell(collector, rowNum, colNum, cols[colNum]);
  }
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
Which then uses collectCell to capture the colNum correctly and dispatch the async methods to obtain the styles and the outer html:
function collectCell(collector, rowNum, colNum, c) {
  // attributes is a flat [name, value, name, value, ...] array, so [1] is the
  // value of the first attribute (assumed here to be class)
  chrome.debugger.sendCommand(breakpointSource, "DOM.getAttributes", { nodeId: c }).then(
    attrs => collector.attrs(rowNum, colNum, attrs.attributes[1])
  );
  // two steps: find the .cell-text node, then ask for its outer html
  chrome.debugger.sendCommand(breakpointSource, "DOM.querySelector", { nodeId: c, selector: ".cell-text" }).then(
    res => {
      chrome.debugger.sendCommand(breakpointSource, "DOM.getOuterHTML", { nodeId: res.nodeId }).then(
        html => collector.html(rowNum, colNum, html.outerHTML)
      );
    }
  );
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/js/service-worker.js
Note that collecting the cell text involves two steps: one is finding the div with class cell-text, and the other is fetching the "outer html" for that div. Even so, no additional counting is required in the collector, because the intermediate step never touches the collector: only the final result is reported, exactly once.
Note also that both these methods tell the collector the explicit row and column numbers so that it can put the entry in the right place; the earlier methods made sure that all the arrays were already in place and populated.
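One caveat about attrs.attributes[1]: DOM.getAttributes returns a single flat array of interleaved attribute names and values, so index 1 is only the class value when class happens to be the first attribute on the element. A slightly more defensive sketch would look the attribute up by name. The helper names here are hypothetical and not part of the plugin.

```javascript
// Turn the flat [name1, value1, name2, value2, ...] array into a lookup object.
function attributesToObject(flat) {
  const result = {};
  for (let i = 0; i + 1 < flat.length; i += 2) {
    result[flat[i]] = flat[i + 1];
  }
  return result;
}

// Look up the class attribute by name rather than by position,
// and split it into the individual class names.
function classNames(flat) {
  const cls = attributesToObject(flat)["class"] || "";
  return cls.split(/\s+/).filter(s => s.length > 0);
}

// Example input in the interleaved shape, with class NOT in first position.
const example = ["id", "c1", "class", "cell blue"];
const byPosition = example[1];       // "c1", which is not a style at all
const byName = classNames(example);  // ["cell", "blue"]
```

In the grid used here the cells presumably only carry a class attribute, in which case the positional version gives the same answer; the lookup just guards against that changing.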
Finally, when we have gathered together all of the DOM information, sendWhenComplete finds that have has caught up with need, and sends the finished array to the side panel.
Displaying the Results
Displaying the results is really nothing different to what we've done twice before, which is to populate a table. And I've tried to make it easy by preparing the data in the most amenable form.
What I had tried to do on my dead branch was to present the text and styles "cleanly", but given that I can't recover the simple text now, that isn't possible anymore, so I'm just going to show the outer html and all of the styles (by definition, they all have cell, which isn't really a style so much as a marker).
We need to get the table body:
var dombody = document.getElementById("display-dom");
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/html/js/sidepanel.js
In the onMessage handler, we need to handle the present-dom action:
chrome.runtime.onMessage.addListener(function(request, sender, respondTo) {
  switch (request.action) {
  case "hitBreakpoint": {
    var l = request.line;
    breakAt = sourceLines[l].children[1];
    debuggerActive();
    break;
  }
  case "showState": {
    if (debugMode) {
      showState(request.state);
    }
    break;
  }
  case "present-dom": {
    if (debugMode) {
      presentDom(request.info);
    }
    break;
  }
  default: {
    console.log("message:", request);
    break;
  }
  }
});
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/html/js/sidepanel.js
Which calls presentDom, which is just basic table construction much like when we did the state table:
function presentDom(dom) {
  dombody.innerHTML = '';
  for (var r of dom) {
    for (var c of r.rowInfo) {
      var text = c.outer;
      presentDomRow(r.rowNum, c.colNum, text, c.styles);
    }
  }
}
function presentDomRow(row, col, label, styles) {
  var tr = document.createElement("tr");
  dombody.appendChild(tr);
  presentCell(tr, row);
  presentCell(tr, col);
  presentCell(tr, label);
  presentCell(tr, styles);
}
function presentCell(tr, str) {
  var td = document.createElement("td");
  td.appendChild(document.createTextNode(str));
  tr.appendChild(td);
}
CDP_DOM_DOMAIN_RECOVERY:cdp-till/plugin/html/js/sidepanel.js
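If the markup in the label column gets in the way, one of the hacks hinted at above would be to strip the tags back out of the outer html before displaying it. The sketch below is exactly that, a hack: roughText is a hypothetical helper, it assumes simple, well-formed markup like the cell-text divs here, and it only decodes a handful of common entities.

```javascript
// Crude tag stripper: assumes simple markup with no "<" or ">" inside
// attribute values, and only knows about four common entities.
function roughText(outerHTML) {
  return outerHTML
    .replace(/<[^>]*>/g, "")   // drop anything that looks like a tag
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&")    // last, so "&amp;lt;" ends up as "&lt;"
    .trim();
}

const plain = roughText('<div class="cell-text">Tea &amp; Cake</div>');
const nested = roughText('<div class="cell-text"><b>1</b> &lt; 2</div>');
```

This could be applied to c.outer in presentDom before building the row. Since the side panel has a real DOM available, parsing the string with DOMParser and reading textContent would probably be more robust than a regular expression, but I haven't wired that in here.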
And there we have it (at long last).
Conclusions
I can't believe that this was as complicated as it turned out to be. I'm also very confused about whether it was my incompetence, a lack of clarity in the documentation, or a lack of consideration of this case which caused me to have so many problems. In particular, the fact that there doesn't seem to be a way of obtaining text nodes in the DOM domain seems implausible to me. Likewise the fact that content scripts are "isolated" from the main document JS environment, but nevertheless won't run while it is stopped at a breakpoint.
Am I missing something? If so, please let me know.
That notwithstanding, we did manage to scrape the DOM for styles and HTML, and we could have done anything we wanted if I had been prepared to put in the effort (and hack stuff), but it really wasn't that important to me and I'd used up my "discretionary" time bucket going around in circles. But maybe, at the end of the day, it was more instructive than just being successful first time.