Python and Selenium for a no-permissions Salesforce profile

Katie - Aug 23 '19 - - Dev Community

I needed a fresh "no permissions" Salesforce profile in a mature org.

I couldn't get the Metadata API working fast enough to my liking, so I gave up and built a "robot version" of myself with Python and Selenium, hijacked my own browser, and watched fake-me click thousands of buttons at computer speed.

(Does that make my computer itself the #AwesomeAdmin?)

Here's how I did it.

Background

I'm about to test a brand new Pardot training environment I was issued, and I want to connect it to a Salesforce sandbox under a philosophy of least privilege.

That means creating a Salesforce Profile with no object-level CRUD access and no field-level Read/Write access whatsoever.

I'll flip things back on if I need them.

Unfortunately, we don't have any such profiles to clone.

  • I cloned the closest one I could find -- a "read only" one -- but it had accumulated a lot of read/edit (yes, edit, despite its name) access it didn't need.
  • To get it to the "no permissions" status I desired, I had a lot of edits to do.

Read on to see what I tried.


Take 1: Welkin Suite

I thought that The Welkin Suite's Objects Permissions Editor and Field Level Security tools could help me.

  • Indeed, the Objects Permissions Editor made quick work of shutting off object-level CRUD access.
  • However, it was too many clicks, and too much waiting, to pull up hundreds of objects one at a time in the Field Level Security tab of Welkin and shut things off object-by-object.
    • The editor is great for dropping the hammer and zeroing out permissions for a profile, or several, on 5 objects. However, it was still too much work for 150 objects.

Take 2: Metadata API

I thought:

"Welkin, as well as the Salesforce setup page authors, have both built tools that query Salesforce all security-settable fields for an object so as to present them to users.

"I'll just query whatever they're querying and loop over it, building a humongous Pardot Data Sync.profile in XML, ZIP it up, and deploy it to my sandbox in Workbench."

The contents of Pardot Data Sync.profile would look like this:

<?xml version="1.0" encoding="UTF-8"?>
<Profile xmlns="http://soap.sforce.com/2006/04/metadata">
    <fieldPermissions>
        <allowCreate>false</allowCreate>
        <editable>false</editable>
        <field>SomeObjectAPIName.SomeFieldAPIName</field>
        <hidden>true</hidden>
        <readable>false</readable>
    </fieldPermissions>
    ...
    <fieldPermissions>
        <allowCreate>false</allowCreate>
        <editable>false</editable>
        <field>SomeObjectAPIName.SomeFieldAPIName</field>
        <hidden>true</hidden>
        <readable>false</readable>
    </fieldPermissions>
</Profile>
Enter fullscreen mode Exit fullscreen mode

That file would be inside a folder within the .ZIP called "profiles."

At the top level of the .ZIP would be a file called package.xml with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <fullName>MiscLazyStuff</fullName>
    <types>
        <members>Pardot Data Sync</members>
        <name>Profile</name>
    </types>
    <version>46.0</version>
</Package>
Enter fullscreen mode Exit fullscreen mode

I didn't know how to get a list of SomeObjectAPIName.SomeFieldAPIName values to build my <fieldPermissions>...</fieldPermissions> wrappers around, so I started by writing a Python script to loop over every *.object XML file I had downloaded to my computer with Welkin.

Unfortunately, I ran into no end of metadata deployment errors -- each of which I could only see one of at a time.

  1. First I had to edit my script to ignore formula fields
  2. Then I had to edit it to ignore Master-Detail fields
  3. Then I had to edit it to ignore mandatory lookup fields. A required property doesn't exist on standard fields of standard objects in the .object files, so to be safe, I just ignored all Lookup fields that didn't have a required flag either way.\
  4. And, of course, I had to ignore fields with a required property of true.
  5. But still, there were more standard fields of standard objects that were required without telling me so in their *.object XML definitions.I ran out of patience when the error messages told me that AuthorizationFormConsent.ConsentCapturedDateTime was required but its code in AuthorizationFormConsent.object looks like this:
<fields>
    <fullName>ConsentCapturedDateTime</fullName>
    <trackHistory>false</trackHistory>
</fields>
Enter fullscreen mode Exit fullscreen mode

Wait ... how was I supposed to know that about ConsentCapturedDateTime?!

Where is this documented / extractable??!!**

I spent another 3 hours banging my head against the wall playing with the Tooling API and Apex, trying to find a reliable way to get the same kind of "don't bother with this field" blacklist (or a "do bother with these fields" whitelist) that the Welkin Suite and the authors of Salesforce's web-based Setup pages have obviously managed to come up with before I quit.

I mean, was I supposed to screen-scrape documentation for the word "required" at the beginning of a "Description" paragraph or something?

At a certain point, I was done.

NOTE: Will take input!

Welkin folks, Salesforce employees, or other developers -- I'd love to hear by what algorithms you figure out exactly which fields to present to users as permission-editable.

Until then ... instead of building my own car, I decided to "There I Fixed It" the wheel of a preexisting car. Leading me to ...


Take 3: Setup menu + Selenium (and a bit of Data Loader)

Selenium is a programmable suite of software that acts as middleware between your web browser and a programming language of your choice.

Typically, developers use it to imitate a user clicking buttons on a web page so that they can ensure that their web page works as intended.

  • Computerization saves hassle -- who really wants to click all the buttons that need to be clicked when it's the same thing every time?
  • Computerization scales well -- this is a reliable way to simulate millions of humans clicking your web page at the same time (like if you're a giant e-retailer during major shopping holiday seasons).

I installed it because I knew I needed to "un-click" somewhere between several and hundreds of checkboxes per screen on dozens to hundreds of Salesforce Setup pages that were all predicably like each other.

Sounds like a perfect use case for automating my work!

I could explore what my web browser was doing on my behalf as I clicked the first few buttons, then teach the computer to do it for me, and teach the computer what the subtle variations were to account for when switching from one Salesforce "object" to another.

Step 1: Log In

First, I ran the following Python code

from selenium.webdriver import Chrome
browser = Chrome(executable_path='C:\\exampleprograms\\ChromeDriver\\chromedriver.exe')
browser.get('https://test.salesforce.com')
Enter fullscreen mode Exit fullscreen mode

This caused a special instance of Chrome to pop up on my computer.

I logged into Salesforce with my username and password as usual.

Step 2: Get My Object Links

Then I commented out the code above and, without letting my Python IDE clear out the value of the browser variable (which is normal behavior of my IDE from one "run" button click to the next), I ran the following code:

editableLinksToVisit = []
browser.get('https://cs99.salesforce.com/00eX00000000XXX?s=ObjectsAndTabs')
objtable = browser.find_element_by_css_selector("[id$='table_objects_and_tabs:tb']")
rows = objtable.find_elements_by_xpath('.//*')
for row in rows:
    tds = row.find_elements_by_tag_name("td")
    if len(tds) > 2:
        td1 = tds[0]
        td2 = tds[1]
        td2span = td2.find_element_by_tag_name("span")
        if td2span is not None and td2span.text != '--':
            td1span = td1.find_element_by_tag_name("span")
            if td1span is not None:
                td1a = td1span.find_element_by_tag_name("a")
                if td1a is not None:
                    td1aText = td1a.text
                    td1aLink = td1a.get_attribute('href')
                    td1aEditableLink = td1aLink.replace('?s=ObjectsAndTabs','/e?s=ObjectsAndTabs')
                    editableLinksToVisit.append({'text':td1aText, 'editLink':td1aEditableLink})
Enter fullscreen mode Exit fullscreen mode

Here's what the code above does:

  1. The browser.get() command made me visit a new web site in Chrome -- as if I myself had typed that URL into my browser bar and hit "enter."
    • It's pretty trippy -- you literally watch your computer do things without you, like someone else has taken over control.
  2. The URL I told browser.get() about is the URL of the "objects and tabs" screen of setup for the profile I was trying to edit (which I've written here as having an object ID of 00eX00000000XXX).
  3. When I run browser.get(), the contents stored in my program's browser variable change.
  4. Then I use various methods built into the data type I've stored in browser (I forget exactly what it is) such as .find_element_by_css_selector() and .find_element_by_tag_name() to scrape the contents of this particular web page.
    • Technically, I'm scraping the DOM, or the thing you see when you use the Firefox or Chrome developer console "inspect element" tool, not scraping the HTML source. Their contents are ever-so-slightly different. It used to be that you could tell what HTML was making up a page just by looking at its source. These days, JavaScript can "inject" HTML into the "DOM" at the last minute, and your browser actually renders a web page based on the DOM, so you have to know how to check what's really in the DOM if you want to parse the code behind a web page.
  5. My big for loop and nest of if statements looks for links in a specific table on the page that are not next to the text "--."
    • "--" would indicate that a link goes to settings for a Tab, not an Object, so I want to skip those.
  6. Then I grab the name of the object (per the link's text) and its URL.
  7. Finally, I edit the link by sticking an /e right before the ?s= so I can go straight to the editable version of that object's profile-specific configuration page when I visit my link.

My big list of object names and their links is what I store as editableLinksToVisit.

Step 3: Click all the buttons

I again commented out the code above and, without letting my Python IDE clear out the value any variables (which is normal behavior of my IDE from one "run" button click to the next), grabbed a cup of water and ran the following code (it took about 15-25 minutes):

for linkDict in editableLinksToVisit:
    print(linkDict.get('text'))
    browser.get(linkDict.get('editLink'))

    submitbutton = browser.find_element_by_css_selector("input[type='submit'][value='Save'][id$='button_pc_save']")
    editinputs = browser.find_elements_by_css_selector("input:enabled:checked[id$=':fls_edit_ck']")
    readinputs = browser.find_elements_by_css_selector("input:enabled:checked[id$=':fls_read_ck']")

    if submitbutton is not None and (len(editinputs) + len(readinputs) > 0):
        print('submit ' + str(submitbutton is not None))
        print('edits ' + str(len(editinputs)))
        print('reads ' + str(len(readinputs)))
        for ipt in editinputs:
            ipt.send_keys(" ")
        for ipt in readinputs:
            ipt.send_keys(" ")
        submitbutton.send_keys("\n")
        print('done')
    else:
        print('oops')
        print('submit ' + str(submitbutton is not None))
        print('edits ' + str(len(editinputs)))
        print('reads ' + str(len(readinputs)))
Enter fullscreen mode Exit fullscreen mode

This code is a big loop over all the description-URL link pairs (tuples) I'd saved into editableLinksToVisit.

Here's what it does for each link in the list:

  1. It writes down a note I can read telling me what object it's supposed to be about to hijack my browser to visit the URL of
  2. It hijacks my browser and visits that URL.
  3. It reads the code of the page's DOM to find all of the field-level security boxes that are currently checked and that are un-checkable.
    • It does this into separate "edit" and "read" checkbox lists.
  4. It reads the code of the page's DOM to find the Save button.
  5. If it found anything that "could use un-checking," it loops over the lists and un-checks the boxes.
    • I loop over "edits" first, then "reads," since un-checking "read" checkboxes can have a side effect of un-checking "edit" sideboxes and I wanted to keep things simple and not worry about side effects.
    • I use an imitation of hitting the spacebar after "tabbing" to a checkbox with the keyboard, rather than Selenium's .click() command, because .click() wasn't working reliably and someone on StackOverflow said this might and they were right. 🤷
  6. It clicks the Save button (again, I simulate this by virtually hitting "enter" on the keyboard rather than by clicking because for some reason that worked more reliably).

It's a bit mesmerizing to watch (note -- animation below):

For a few minutes, anyway.

Pro tip: Don't start it running when you're already staying late at the office and really just want to shut down your computer and enjoy a beautiful evening. 🌅

Step 4: Clean up with Data Loader

CampaignMember and a few other objects have editable permissions but didn't show up as editable through the web panel, so I found them by extracting this query with the Salesforce Data Loader:

SELECT Id, Field, SObjectType, PermissionsRead, PermissionsEdit
FROM FieldPermissions 
WHERE parentId IN ( SELECT id 
                    FROM permissionset 
                    WHERE PermissionSet.Profile.Name = 'Pardot Data Sync')
AND (PermissionsRead = true or PermissionsEdit = true)
Enter fullscreen mode Exit fullscreen mode

I used Excel to flip "PermissionsRead" and "PermissionsEdit" to FALSE all the way down both columns.

Then, in two passes -- again, doing "edit" first and "read" second (that is, I mapped "Id" both times but only mapped one column I wanted to edit at a time), with my settings to load just 1 record at a time (so a single error wouldn't shut down a batch), I passed this spreadsheet into Update operations in Data Loader.

That also took a few minutes (again, remember, 1 record at a time), but eventually it went through.


Not A Production Solution

This is not sustainable as a long-term solution to creating a "zero" profile.

Loading through the Metadata API is so much more kosher and can probably be done all in one step, instead of having to muck around with the Data Loader for "cleanup."

Plus, I'm not even really sure I did get every field-level security setting that I could've turned off.

I'd like to be more thorough in the future and truly bulk-edit permissions the way I'm supposed to.

  • Q: Anyone have any tips for building a perfect list of objects and fields to write out permissions for into a .profile file?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .