For those fortunate enough not to know, an FBAR is a form required by the US Government for all US Persons who have assets worth more than $10,000 in another country. And once you have more than that, they want all the details, no matter how big or small. And to make matters worse, while they will allow you to fill in your personal details just once, if you have a joint account, you need to identify that individual for each and every jointly held asset.
And you need to do this every year, even though almost no information changes from year to year.
So, for a long time, I have wanted to automate filling this form in. Normally, I download the PDF version and complete it. Two years ago, I tried to see if I could automate that using the PDFBox tool, but for various reasons that did not work. But last year, I discovered that there is also an online version of the form, and a few weeks ago I discovered the Playwright Chrome Driver library. So …
Let's download Playwright
One of the cool things about Playwright (from my perspective) is that it has an API in Java, so I'm going to use that. To build everything, I'm going to use Gradle, since that seems quite common these days. To start with, I'm going to have this build.gradle file:

    plugins {
        id 'java'
        id 'application'
    }
    mainClassName = 'ignorance.FBAR'
    repositories {
        jcenter()
    }
    dependencies {
        implementation 'com.microsoft.playwright:playwright:1.30.0'
    }
    task copyToLib(type: Copy) {
        from configurations.default
        into "$buildDir/output/lib"
    }

Following along from the documentation, the first step is to create a central Playwright instance and then use that to open a browser window. I tend to use Chrome, so that's what I'm doing here, but you can also use WebKit or Firefox.

    package ignorance;

    import com.microsoft.playwright.Browser;
    import com.microsoft.playwright.Page;
    import com.microsoft.playwright.Playwright;

    class FBAR {
        public static void main(String[] argv) {
            try (Playwright playwright = Playwright.create()) {
                Browser browser = playwright.webkit().launch();
                Page page = browser.newPage();
                page.navigate("http://whatsmyuseragent.org/");
            }
        }
    }

(In order to get this to work with Eclipse, at least for me, I needed to copy the files into a local directory, for which I created the Gradle task copyToLib, ran it, and then added the copied JAR files to my classpath.)
When I first ran this, it downloaded a whole bunch of files, which appear to be the browsers it supports. And then, to my (not) very great surprise, it threw an exception:

    Caused by: com.microsoft.playwright.impl.DriverException: Error {
      message='Target closed
      name='Error
      stack='Error: Target closed

Now, I have basically no idea what this means, but I'm going to assume I haven't set something up correctly. Before panicking though, I'm going to try it again. No, still no joy.
Reviewing the code, I realized I was a little over-zealous with my copying and, instead of launching chromium as planned, I launched webkit. I'm not really sure what that does, or what browser it would use, but changing it to chromium certainly solves the problem.
And I also want to see what I am doing, so I have added the headless-off and slowmo options to the launch configuration. And so we have the following:

    Browser browser = playwright.chromium().launch(new BrowserType.LaunchOptions().setHeadless(false).setSlowMo(50));
    Page page = browser.newPage();
    page.navigate("https://bsaefiling1.fincen.treas.gov/lc/content/xfaforms/profiles/htmldefault.html");

Filling in some fields
So, since I don't know what I am doing, I am going to just start randomly filling in the form using the most stable references I can find (my overall plan is to pull all the info from my personal records, probably through a JSON intermediary). If you haven't pulled the link from the code, the form I'm filling in is at https://bsaefiling1.fincen.treas.gov/lc/content/xfaforms/profiles/htmldefault.html. And obviously, I have this open in a regular Chrome window with the inspector on so I can find things in it.

It seems to me the best way of identifying the first field - Email Address - is to use the div with class EmailAddress and then find the input within that. So let's do that and give my official email address - mickey.mouse@disney.com.

    page.fill("div.EmailAddress input", "mickey.mouse@disney.com");

OK, that didn't work. It loaded the form, and then paused for a long while (30s to be precise) before giving up and telling me it couldn't find the div:

    Timeout 30000ms exceeded.
    =========================== logs ===========================
    waiting for locator("div.EmailAddress input")
    ============================================================

And, after mature reflection, I realized that the div class is actually Email, not EmailAddress, so let's try that again:

    page.fill("div.Email input", "mickey.mouse@disney.com");

Indeed, that does work. So let's quickly fill in the rest of the details in the form and check in.

    page.fill("div.Email input", "mickey.mouse@disney.com");
    page.fill("div.ConfirmedEmail input", "mickey.mouse@disney.com");
    page.fill("div.FirstName input", "Mickey");
    page.fill("div.LastName input", "Mouse");
    page.fill("div.PhoneNumber input", "770-555-1234");

The next thing to do is to "start" filling in the form. I'm not quite sure why those first fields don't count as filling in the form. This involves pushing the "Start FBAR" button, which is the click action on the page.
But, as I go to look at the documentation for this, I discover that both fill and click on the Page are now discouraged in favour of locators, so I am going to digress for a moment into refactoring to use these.
It would seem that this is an attempt to abstract away CSS selectors, and this makes a lot of sense to me. It's more typing, to be sure, but we should never be afraid of trading typing for reliability and correctness.
Interestingly, in doing this, it turns out that there are a number of "duplicate" entries in the form. In particular, because the name is matched by default as a case-insensitive substring, the phrase "Enter your email address" also matches the "Re-enter your email address" confirmation field. To disambiguate, it is necessary to add .setExact(true) to the options. But I have to say, the error message is exceedingly helpful and clear:

    Error: strict mode violation: getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.")) resolved to 2 elements:
    1) <input type="text" class="_O" name="Email_5" placehold…/> aka getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.").setExact(true))
    2) <input type="text" class="_O" placeholder="" maxlength…/> aka getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Re-enter your email address."))

So, with the refactoring done and the click() added, let's check in again:

    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.").setExact(true)).fill("mickey.mouse@disney.com");
    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Re-enter your email address.")).fill("mickey.mouse@disney.com");
    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your first name.")).fill("Mickey");
    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your last name.")).fill("Mouse");
    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your telephone number. Do not include formatting such as spaces, dashes, or other punctuation.")).fill("770-555-1234");
    page.getByRole(AriaRole.BUTTON, new Page.GetByRoleOptions().setName("Please click this button to begin preparing your FBAR.")).click();

So now we can quickly fill in the next few fields. I am hoping that I will never file this form late again (because it will be so easy when I have this script working!) but in the past few years I have struggled because they moved the deadline from June 30 to April 15 (to align with the US tax filing deadline). And I keep forgetting that. So, for the purposes of this blog, I will choose a reason and provide an explanation.

    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Filing name")).fill("Mouse FBAR 2022");
    page.getByRole(AriaRole.COMBOBOX, new Page.GetByRoleOptions().setName("reason")).selectOption("A");
    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation")).fill("I keep forgetting the deadline has changed.");

Once again, it fails. And once again, Playwright's exception message is very clear:

    Timeout 30000ms exceeded.
    =========================== logs ===========================
    waiting for getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation"))
    locator resolved to <textarea class="_k" placeholder="" maxlength="750" tabin…></textarea>
    elementHandle.fill("I keep forgetting the deadline has changed.")
    waiting for element to be visible, enabled and editable
    element is not enabled - waiting...
    ============================================================

The element is not enabled. Checking by hand, it seems that "I forgot" is enough of an explanation and that you only need to provide an explanation if you choose "Other". I'm not that bothered, so I'm just going to comment all that lot out and move on.

    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Filing name")).fill("Mouse FBAR 2022");
    // page.getByRole(AriaRole.COMBOBOX, new Page.GetByRoleOptions().setName("reason")).selectOption("A");
    // page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation")).fill("I keep forgetting the deadline has changed.");

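Looking back at the strict mode violation earlier in this section, the double match is easier to see with a small model of how the name is compared. The sketch below is plain Java with no Playwright dependency. NameMatchSketch, normalize, matches and resolve are names made up for illustration, and the logic only mimics the documented default (a case-insensitive substring match, or a case-sensitive whole-string match when exact is set). It is not Playwright's actual implementation.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Illustrative model only: this is NOT how Playwright is implemented.
class NameMatchSketch {

    // Trim and collapse runs of whitespace so spacing differences do not matter.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // exact == false: case-insensitive substring match (the default).
    // exact == true:  case-sensitive match of the whole string.
    static boolean matches(String accessibleName, String wanted, boolean exact) {
        String name = normalize(accessibleName);
        String want = normalize(wanted);
        if (exact) {
            return name.equals(want);
        }
        return name.toLowerCase(Locale.ROOT).contains(want.toLowerCase(Locale.ROOT));
    }

    // Return every label that the requested name would match.
    static List<String> resolve(List<String> labels, String wanted, boolean exact) {
        return labels.stream()
                .filter(label -> matches(label, wanted, exact))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> labels = List.of(
                "Enter your email address.",
                "Re-enter your email address.");

        // "Re-enter ..." contains "enter ..." once case is ignored, so both match.
        System.out.println(resolve(labels, "Enter your email address.", false));

        // With exact matching only the first label survives.
        System.out.println(resolve(labels, "Enter your email address.", true));
    }
}
```

The second label only needs exact matching if some other label contains it, which is why the "Re-enter" locator works without setExact(true).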
Filling in the joint assets
Right, well, so far, I don't think I've achieved anything very much. As my wife would say, "you could have done that by hand with a lot less typing". Fair enough. So let's skip to the interesting part of the operation (page 3 is just more information about the primary filer, which only needs to be provided once). Parts II and III reflect accounts owned individually or jointly, and the forms can be duplicated by using the appropriate + button in the top right-hand corner of the page. Now, as noted above, the main thing I want to avoid is providing my wife's details ten times on the ten copies of the page for each of the forms (I don't actually want to do any of it, but this is the thing that drives me crazy).

In order to show the important things, then, I'm going to create two classes now: Portfolio, which holds all my assets, and JointAsset, which holds the information about a joint asset. Because this shares most of its information with an individually held asset, I'm only going to give it two fields: an AccountInfo and an Asset; I'm going to reuse the former when I go back and fill in Part I.
In the fullness of time, I will extract all the information about the portfolio from its ultimate sources of truth; for now, I am just going to hack something together. Anyway, it's all in PortfolioLoader.java:

    package ignorance;

    public class PortfolioLoader {
        public Portfolio load() {
            Portfolio ret = new Portfolio();
            AccountInfo me = new AccountInfo();
            AccountInfo other = new AccountInfo();
            ret.user(me);
            ret.joint(new JointAsset().jointWith(other).setMaximumValue(10000).setType("A"));
            return ret;
        }
    }

All the other classes I created are just boring POJOs, although you could think of them as DTOs between the two systems (the loader and the form-filler).
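Since those POJOs are not shown, here is a minimal sketch of what they might look like, inferred only from the calls that appear in this post (user, joint, joints, jointWith, setMaximumValue, setType, getMaximumValue, getType, getCountry). Everything else is an assumption made for illustration: the setCountry methods, the field names, the user() getter and the way JointAsset delegates to its Asset. The real classes in the repository may well differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical shapes, guessed from how the classes are used in the post.
class AccountInfo {
    private String country;

    AccountInfo setCountry(String country) { this.country = country; return this; } // assumed setter
    String getCountry() { return country; }
}

class Asset {
    // Package-private fields keep the sketch short; JointAsset fills them in.
    int maximumValue;
    String type;
    String country;
}

class JointAsset {
    // The two fields described in the text: who the asset is held with, and the asset itself.
    private AccountInfo with;
    private final Asset asset = new Asset();

    // Fluent setters return this so that calls can be chained as in PortfolioLoader.
    JointAsset jointWith(AccountInfo other) { this.with = other; return this; }
    JointAsset setMaximumValue(int value) { asset.maximumValue = value; return this; }
    JointAsset setType(String type) { asset.type = type; return this; }
    JointAsset setCountry(String country) { asset.country = country; return this; } // assumed setter

    AccountInfo getJointWith() { return with; } // assumed getter
    int getMaximumValue() { return asset.maximumValue; }
    String getType() { return asset.type; }
    String getCountry() { return asset.country; }
}

class Portfolio {
    private AccountInfo user;
    private final List<JointAsset> joints = new ArrayList<>();

    void user(AccountInfo me) { this.user = me; }
    AccountInfo user() { return user; } // assumed getter
    void joint(JointAsset asset) { joints.add(asset); }

    // Read-only view so the form-filler cannot modify the portfolio by accident.
    List<JointAsset> joints() { return Collections.unmodifiableList(joints); }
}

class PojoSketchDemo {
    public static void main(String[] args) {
        Portfolio portfolio = new Portfolio();
        portfolio.user(new AccountInfo());
        portfolio.joint(new JointAsset().jointWith(new AccountInfo()).setMaximumValue(10000).setType("A"));
        for (JointAsset joint : portfolio.joints()) {
            System.out.println(joint.getType() + " " + joint.getMaximumValue());
        }
    }
}
```

Returning this from each setter is what allows the chained style used in PortfolioLoader.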
For now, we are just going to try and load one account. This should not be too difficult. Having said that, we are going to build it as if we are loading multiple accounts and just throw an error if we reach the second.
So, we start by doing the obvious thing:

    page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("*15")).fill(Integer.toString(joint.getMaximumValue()));

This should identify the maximum account value field, but in fact there are four of them (one in each of sections II, III, IV and V), so that doesn't work that well. We need some means of distinguishing them.
Looking through the structure, there is a div with the classes subform and Part3, and that would seem enough of a distinction. Note that although we would prefer to use a nice, stable mechanism for identifying the fields, in a pinch it is still possible to use a selector. So let's do that now.

    boolean first = true;
    for (JointAsset joint : portfolio.joints()) {
        if (!first) {
            throw new RuntimeException("Not implemented");
        }
        first = false;
        // Scope the lookups to the Part III subform so that *15 and *16 are unambiguous
        Locator mypage3 = page.locator("div.subform.Part3");
        mypage3.getByRole(AriaRole.TEXTBOX, new Locator.GetByRoleOptions().setName("*15")).fill(Integer.toString(joint.getMaximumValue()));
        mypage3.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("*16")).selectOption(joint.getType());
    }
    // Keep the process (and so the browser window) alive for ten seconds
    Thread.sleep(10000);

Very good. That works. Let's check in again.

So the one remaining thing I'm interested in experimenting with before I get serious and start integrating things is to try adding a second page for a second asset. Adding the second asset to PortfolioLoader is easy enough:

    ret.joint(new JointAsset().jointWith(other).setMaximumValue(10000).setType("A"));
    ret.joint(new JointAsset().jointWith(other).setMaximumValue(20000).setType("B"));
    return ret;

That is all fine and dandy until we reach the exception we included earlier for the "more than one" case. Now we need to go back and handle that.
The first thing to do is to add another page by clicking on the "+" button. This has "+" as its aria label, so we can do this quite easily:

    for (JointAsset joint : portfolio.joints()) {
        Locator mypage3 = page.locator("div.subform.Part3");
        if (!first) {
            mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).click();
        }
        // (rest of the loop body as before)
    }

This works first time, but then puts us in the situation where it cannot resolve which of the two "Maximum Value" entries it should be considering:

    Error: strict mode violation: locator("div.subform.Part3").getByRole(AriaRole.TEXTBOX, new Locator.GetByRoleOptions().setName("*15")) resolved to 2 elements:
    1) <input class="_s" value="10000" type="numeric" placeho…/> aka locator("input[name=\"MaxAcctValue_137\"]")
    2) <input class="_s" type="numeric" placeholder="" tabind…/> aka locator("input[name=\"MaxAcctValueCL_1676554053694\"]")

It turns out that the Locator abstraction is happy to contain one or many potential DOM nodes until you decide to do something with them; then it complains about the fact that it cannot choose. But we can easily force that choice using the last() operator (we could use first() or nth(), but since the new forms are added at the end, last() is what is wanted).
So now we have this code:

    Locator mypage3 = page.locator("div.subform.Part3");
    if (!first) {
        mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).click();
        mypage3 = mypage3.last();
    }

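The lazy behaviour described above (a Locator can stand for many nodes and only objects when asked to act on them) can be mimicked with a toy class. This is a sketch in plain Java, not Playwright code: LazyLocatorSketch, resolveOne and the list standing in for the DOM are all invented for illustration, and the only thing it tries to show is why narrowing with last() after clicking "+" picks up the newly added form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model of a lazily resolved, strict locator. Not Playwright's implementation.
class LazyLocatorSketch<T> {
    // The query is stored, not its result, so it is re-run every time we act.
    private final Supplier<List<T>> query;

    LazyLocatorSketch(Supplier<List<T>> query) {
        this.query = query;
    }

    // Narrowing builds another lazy locator; nothing is resolved yet.
    LazyLocatorSketch<T> last() {
        return new LazyLocatorSketch<>(() -> {
            List<T> all = query.get();
            return all.isEmpty() ? all : all.subList(all.size() - 1, all.size());
        });
    }

    // "Acting" resolves the query now and insists on exactly one match.
    T resolveOne() {
        List<T> found = query.get();
        if (found.isEmpty()) {
            // The real library would keep waiting until its timeout instead.
            throw new IllegalStateException("no element found");
        }
        if (found.size() > 1) {
            throw new IllegalStateException(
                    "strict mode violation: resolved to " + found.size() + " elements");
        }
        return found.get(0);
    }

    public static void main(String[] args) {
        // A list stands in for the page; each entry is one "Maximum Value" field.
        List<String> page = new ArrayList<>(List.of("MaxAcctValue #1"));
        LazyLocatorSketch<String> field = new LazyLocatorSketch<>(() -> page);

        System.out.println(field.resolveOne());        // one match, so this is fine

        page.add("MaxAcctValue #2");                   // simulates clicking "+"

        System.out.println(field.last().resolveOne()); // picks the newly added field
        try {
            field.resolveOne();                        // now ambiguous
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the query is re-run on every action, the same locator object sees the second form as soon as it exists, with no need to look it up again.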
Conclusion
In the space of this blog post, I have convinced myself that Playwright is a reasonable tool for interacting with websites. Hopefully after the conclusion, I will be able to go on and build a tool to import JSON files and fill out FBARs. This will save me a headache for years to come - and may be of use to others too! The repository will be updated to include the final version, even though it is not shown here.
Addendum
In the process of working through the rest of the features, I discovered a number of wrinkles that needed to be resolved.

Selecting the country on this form is a little tricky, as it has an aria-label which is the empty string. I don't think there is anything that can capture that. I solved this problem by instead selecting based on a good, old-fashioned CSS locator. It's not as elegant, but it does the trick.

    mypage3.locator("div.partSub div.choicelist.Country select").selectOption(joint.getCountry());

When filling out the address of the owners, it turns out that there is JavaScript logic that connects the country to the list of states. Before a state can be selected, this logic must be run. For whatever reason, this happens in real life but not in Playwright. A little bit of googling suggested that a "blur" event was necessary. (By the way, I discovered the monitorEvents operation in the Chrome Developer Tools while I was doing this: check it out.)

    with.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("33")).selectOption(other.getCountry());
    with.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("33")).dispatchEvent("blur");

Early on, I had tested that I could add additional pages to the document. What I had not considered was that this would add additional + buttons. So when I came to add a third asset, it could not tell which + button to push. A simple last() fixed that problem:

    mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).last().click();

At the end of the operation, I manually sign and submit the form, which gives me an opportunity to download what I have submitted. Sadly, this download ended up somewhere in the ether (possibly just in memory) where I could not find it. Before next year, I need to figure this out and save it responsibly to somewhere I can keep records of it.