Looking up users

How to retrieve all users (or groups) from an ISV tenant using REST API calls.

Introduction

Managing large user or group populations can be a daunting task. The REST APIs provided by ISV balance performance with usability, which means a single search can return at most 2500 users at a time.
This article presents an algorithm for retrieving all users (or groups) from the tenant and shows an implementation of that algorithm in JavaScript (Node.js).

Pre-requisites

To get started, you need an ISV tenant with a large number of users (more than 2500), and you should be familiar with the REST API calls.
This article assumes you already know how to obtain an access token and how to search for users (GET operation on /v2.0/Users) using SCIM-based filters.
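
As a reminder of what a single ordered search looks like, here is a minimal sketch that builds the request for one page. The tenant URL, the token value, and the helper name buildUserSearch are placeholders chosen for illustration; only the /v2.0/Users path and the filter, sortBy, sortOrder and count parameters are taken from this article. The Bearer authorization header is the usual OAuth convention and is an assumption here.

```javascript
// Hypothetical helper: builds the URL and headers for one ordered user search.
// tenantUrl and accessToken are placeholders supplied by the caller.
function buildUserSearch(tenantUrl, accessToken, filter = '', count = 2000) {
  const params = [
    ['filter', filter],
    ['sortBy', 'meta.created'],
    ['sortOrder', 'ascending'],
    ['count', String(count)],
  ];
  const query = params
    .filter(([, value]) => value !== '')            // skip the filter when none is given
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join('&');
  return {
    url: `${tenantUrl}/v2.0/Users?${query}`,
    headers: { Authorization: `Bearer ${accessToken}` },
  };
}

const request = buildUserSearch('https://tenant.example.com', 'TOKEN', 'userName sw "a"');
console.log(request.url);
// → https://tenant.example.com/v2.0/Users?filter=userName%20sw%20%22a%22&sortBy=meta.created&sortOrder=ascending&count=2000
```

Encoding each value with encodeURIComponent keeps the spaces and quotes of a SCIM filter from breaking the query string.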

Algorithm

The algorithm shown here relies on the fact that every user created within the ISV registry has a created date. In other words, all users can be placed in a list ordered by that created date.
One important property of this list is that several users can share the same created date: any users created within the same second have the same date.

With this fact, we can perform an ordered search (ordered by created date) and simply search in chunks.
One complication is how to find the start and end of each chunk. We define an initial end for each chunk by simply asking for 2000 users.

The first start is simple: we just start somewhere in the past, before our ISV tenant existed.
Finding the second start is more complicated, because the new starting point must not cause users to be counted twice or skipped. In other words, we need a created date that is not shared by users on both sides of the break.

So we basically have the following algorithm for each page search:

  1. Define a full User List as empty
  2. Perform a user search, ordered by created date, starting with a given date SD (initially some date in the past) and return 2000 users.
  3. When the search returns fewer than 2000 users, we know we have reached the end
    3.a Add the users returned to the full User List and you are done
  4. When the search returns 2000 users, we have more users to look for and need a safe break before the next search.

Looking at the end of a returned list, suppose the last two entries, jerry and jane, have the exact same time stamp. In fact, for any page returned we have the following considerations or possibilities:

  1. The end has many users with the same date
  2. There are more users with the same date that we missed (in other words, the next user after jane also has the same date)
    This means we cannot simply start after jane's created date; we have to find a safe break.
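
To see why a naive break loses users, the following sketch runs two searches against a small in-memory list instead of a real tenant. The sample users, the page size of 3, and the searchAfter function are made up for illustration.

```javascript
// In-memory stand-in for the registry, already ordered by created date.
// jerry, jane and carl were all created within the same second.
const users = [
  { userName: 'bob',   created: '2021-01-01T10:00:01Z' },
  { userName: 'jerry', created: '2021-01-01T10:00:02Z' },
  { userName: 'jane',  created: '2021-01-01T10:00:02Z' },
  { userName: 'carl',  created: '2021-01-01T10:00:02Z' },
  { userName: 'tom',   created: '2021-01-01T10:00:03Z' },
];

// Hypothetical search: users created strictly after startDate, at most count of them.
// ISO timestamps in the same format can be compared as plain strings.
function searchAfter(startDate, count) {
  return users.filter((u) => u.created > startDate).slice(0, count);
}

// Naive paging: start the second search right after the last date of the first page.
const page1 = searchAfter('2000-01-01T00:00:00Z', 3);             // bob, jerry, jane
const page2 = searchAfter(page1[page1.length - 1].created, 3);    // tom

const found = [...page1, ...page2].map((u) => u.userName);
const missed = users.map((u) => u.userName).filter((n) => !found.includes(n));
console.log(missed); // → [ 'carl' ]
```

carl shares jane's created date but did not fit into the first page, so a search that starts after jane's date never returns him.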

Looking at the end of the list above, we want to start the next search with jerry's created date. That search will return jerry and jane again, so we remove them from the current list.
Here is a graphical way to think of this:

The algorithm here is:

  1. As long as the last and next-to-last users have the same created date, drop the last one (we may drop many)
  2. Now that the last and next-to-last users have different dates:
    2.a Record the last created date as the new starting date for the next search
    2.b Drop the last one, as it will be the first one in the next search.

After this, we add the remaining list to our full User List and repeat the search with the new start date, repeating the process for the next page of data.

Note that for each page we request 2000 users but always add slightly fewer to the overall list, in order to find a "real" breaking point.
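
The steps above can be sketched end to end against an in-memory list. This is a minimal sketch, not the implementation used against a real tenant: searchPage is a hypothetical stand-in for the REST search, the sample users and the page size of 4 are invented, and the search is assumed to include users created exactly at the start date, which is what the last step relies on when it says the dropped user will be the first one in the next search.

```javascript
// Drop trailing users that share a created date, then drop one more user
// and use its created date as the start of the next search.
function trimToSafeBreak(page) {
  const kept = page.slice();
  while (kept.length > 1 &&
         kept[kept.length - 1].meta.created === kept[kept.length - 2].meta.created) {
    kept.pop();
  }
  if (kept.length < 2) {
    // Every user in the page has the same created date: no safe break exists.
    throw new Error('page holds a single created date; use a larger page size');
  }
  const nextStart = kept.pop().meta.created;
  return { kept, nextStart };
}

// Page through all users; searchPage(startDate, count) is supplied by the caller.
async function getAllUsers(searchPage, pageSize) {
  const fullList = [];
  let start = '2000-01-01T00:00:00Z';       // some date before the tenant existed
  for (;;) {
    const page = await searchPage(start, pageSize);
    if (page.length < pageSize) {
      return fullList.concat(page);         // last page: keep everything
    }
    const { kept, nextStart } = trimToSafeBreak(page);
    fullList.push(...kept);
    start = nextStart;
  }
}

// Invented sample data: jerry, jane and carl share one created date.
const makeUser = (userName, second) =>
  ({ userName, meta: { created: `2021-01-01T10:00:0${second}Z` } });
const registry = [
  makeUser('alice', 0), makeUser('bob', 1), makeUser('jerry', 2), makeUser('jane', 2),
  makeUser('carl', 2), makeUser('tom', 3), makeUser('ann', 4),
];

// Hypothetical search: inclusive start date, ordered by created date.
const searchPage = async (startDate, count) =>
  registry.filter((u) => u.meta.created >= startDate).slice(0, count);

getAllUsers(searchPage, 4).then((all) => {
  console.log(all.map((u) => u.userName).join(', '));
  // → alice, bob, jerry, jane, carl, tom, ann
});
```

The first page (alice, bob, jerry, jane) is cut back to alice and bob, the second search starts at jerry's created date and returns jerry, jane, carl and tom, and so on until a page comes back short.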

Code Example

The following code snippets show the implementation of the described algorithm in JavaScript (Node.js).

The first method is the starting point: it sets up the required variables and loops until we reach the end of our list.
Note that the method takes three input parameters:
filter is an optional SCIM-based filter that can be used to refine the user list returned, for example: 'userName sw "a"'
stats is an object that holds statistical information (the page count and the number of users retrieved)
grps is a flag indicating whether we also return the groups a user is a member of

  // get all users by filter
  async getAllUserbyFilter(filter, stats, grps=false) {
    // the following object structure contains the information we need for each loop
    // filter is the filter passed in
    // more will be true until we read the last page (i.e. fewer than 2000 users)
    // masterlist is the overall result list
    // block is the starting point (created date) for a search
    let res = {
      filter: filter,
      more: true,
      masterlist: [],
      block: '2000-01-01T00:00:00Z'
    }

    // We keep searching and adding to the master list until more is false
    // we pass our control structure, stats and the grps flag
    while ( res.more ) {
      await this.userpage(res, stats, grps);
    }

    // return the resulting list
    return res.masterlist;
  }

The following method does the actual work of getting a "page" and setting the variables for the next page.

  async userpage(res, stats, grps=false) {
    // increment and log our page count, also display a message to the screen showing we are working !
    stats.page++;
    log.debug('PAGE: ', stats.page);
    console.log('Page: ', stats.page);
    // we need a masterfilter
    let mf;
  
    // build the masterfilter we will use for the actual search:
    // start at the current starting point, order by date, ascending, and return 2k users
    // the filter is optional, so check whether it is missing or empty;
    // if an additional filter is provided, combine it with the date filter
    // 'ge' (greater than or equal) is used because the users created exactly at res.block
    // were dropped from the previous page and must be returned again by this search
    if (!res.filter) {
      mf = 'meta.created ge "' + res.block + '"&sortBy=meta.created&sortOrder=ascending&count=2000';
    } else {
      mf = '((meta.created ge "' + res.block + '") and (' + res.filter + '))&sortBy=meta.created&sortOrder=ascending&count=2000';
    }

    // Make a call to the actual rest api to get a list of users based on the master filter, passing grps
    let list = await this.getUserbyFilter(mf, grps);
  
    // now check that we actually got a result back and not an error
    if (list == null) {
      console.log('Failure in searching!');
      // exit with a non-zero code so callers can detect the failure
      process.exit(1);
    }

    // We now check if the list returned is less than 2000!
    if (list.totalResults < 2000) {
      // we have no more searches to do, so set more to false
      res.more = false;
    } else {
      // we have more searches to do so find a new starting point!
      // remember that arrays are 0 based so the last element is length-1, etc.
      // the following while loop simply compares last to last-1 and if they are the same pop it
      // the length check stops the loop if every user in the page has the same created date
      while (list.Resources.length > 1 &&
             list.Resources[list.Resources.length-1].meta.created === list.Resources[list.Resources.length-2].meta.created) {
        list.Resources.pop();
        //log.trace('>>>>',list.Resources[list.Resources.length-1].meta.created, ' == ', list.Resources[list.Resources.length-2].meta.created )
        //log.trace('>>>>Length: ', list.Resources.length);
      }
      // if only one user is left, the whole page shared one created date and there is no safe break
      if (list.Resources.length < 2) {
        throw new Error('No safe break found: all users in this page share the same created date');
      }
      // we are now in the situation that the last and last-1 are different
      // we need to pop one more and record the new starting block
      res.block = list.Resources[list.Resources.length-1].meta.created;
      list.Resources.pop();
    }
    
    // add the number of users in the remaining list to the total count and log the page count
    stats.got += list.Resources.length;
    log.debug('Page Count: ', list.Resources.length);
    
    // combine with the master list and now log the overall count
    res.masterlist = [...res.masterlist, ...list.Resources];
    log.trace('masterlist size: ', res.masterlist.length);
  }

The above code snippets are written for users but can easily be adjusted for groups.


Martin Schmidt, IBM Security