Extending Superset SSO OAuth Provisioning

Ian Muge
6 min read · Jul 26, 2024

Premise

Our Data Engineering team recently requested a POC and framework for deploying a business intelligence tool, as they wanted something more performant and cost-effective in their toolbox. We love open-source, or better still, open-source with a large, supportive community. The requirements included:

  • Open-source
  • Cost-effective
  • Modular and extensible
  • Easy integration with a data warehouse
  • Low or no-code operation (to enable easier buy-in from less technical users)
  • Secure (RBAC, fine-grained permissions, User provisioning from an SSO provider)
  • Stability and Performance

We researched and ran POCs on several platforms, and Superset emerged as the clear winner. Its code and helm charts are well documented and easy to follow, but in our case we needed to go a bit deeper to extend it. We had to adjust the OAuth provisioning to fit our particular use case, which we discuss here in case anyone else finds it useful.

Dive-in

The Superset helm chart provides a myriad of configuration options. Combined with the platform’s documentation and modularity, accomplishing our exact use case was quite an interesting challenge.

The challenge could be broken down into:

  • Add a new OAuth provider that is not part of the built-in options
  • Ensure the roles passed from the OAuth provider are mapped to internal Superset roles, and that they are updated on subsequent user logins
  • Add a field that we can later use in templated queries, via the Jinja context addon, and as part of dynamic roles

Let us begin by extending the user model so that we can accommodate a new field to be passed from the SSO provider. We extend the model by adding a new `department` field that we shall later populate.

    extraSecrets:
      models.py: |
        from flask_appbuilder.security.sqla.models import User
        from sqlalchemy import Column, String

        class CustomUser(User):
            __tablename__ = 'ab_user'
            department = Column(String(256))

The Alembic migration engine used by Superset handles this well. Because we extend the model before the first installation, the new column is included directly in the initial migration of the user model, and we do not need to touch migrations again. Had we added it after installation, we would have had to rerun the migrations and edit the Alembic versions to bring them in line, by editing the bootstrap job that runs at startup.

With the user model extended, we can configure how user info is fetched from the new SSO provider, including the roles and the additional `department` parameter we added above:

    extraSecrets:
      ...

      custom_sso_security_manager.py: |
        import logging
        from superset.security import SupersetSecurityManager
        from models import CustomUser

        class CustomSsoSecurityManager(SupersetSecurityManager):
            def oauth_user_info(self, provider, response=None):
                logging.debug("Oauth2 provider: {0}.".format(provider))
                if provider == 'new_provider':
                    me = self.appbuilder.sm.oauth_remotes[provider].get('user/info').json()
                    logging.debug("user_data: {0}".format(me))
                    return {
                        'name': me['name'],
                        'email': me['email'],
                        'id': me['sub'],
                        'username': me['email'],
                        'first_name': me['given_name'],
                        'last_name': me['family_name'],
                        'department': me.get("params", {}).get("Department", ""),
                        'role_keys': me.get("groups", [])
                    }

First, we subclass `SupersetSecurityManager` as `CustomSsoSecurityManager` and override how user info is fetched when authenticating with our new SSO provider. Here we use the Flask-AppBuilder support functions to get the user info from the SSO provider’s `userinfo_endpoint`. We also capture the roles under `role_keys`, along with our new `department` field.
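
To make the mapping easier to test outside Superset, here is a minimal sketch of the same translation as a standalone function. The `map_userinfo` name and the sample payload are assumptions for illustration only; the claim names your provider returns may differ.

```python
from typing import Any


def map_userinfo(me: dict[str, Any]) -> dict[str, Any]:
    """Translate a provider userinfo payload into the dict used at login."""
    return {
        "name": me["name"],
        "email": me["email"],
        "id": me["sub"],
        "username": me["email"],
        "first_name": me["given_name"],
        "last_name": me["family_name"],
        # Optional claims fall back to empty values instead of raising KeyError.
        "department": me.get("params", {}).get("Department", ""),
        "role_keys": me.get("groups", []),
    }


# Hypothetical payload, shaped like the claims read in the article's code.
sample = {
    "sub": "abc123",
    "name": "Jane Doe",
    "email": "jane@example.com",
    "given_name": "Jane",
    "family_name": "Doe",
    "params": {"Department": "Finance"},
    "groups": ["MetricsUser"],
}

info = map_userinfo(sample)
print(info["department"], info["role_keys"])
```

Keeping the required claims as direct lookups and the optional ones as `.get()` calls means a missing `email` fails loudly, while a missing `department` simply leaves the field empty.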

Next, we extend the actual authentication flow and use the new user model. We validate the info captured in `userinfo`, create the user if they do not exist yet, update their details if they do, and finally assign them to an internal role that matches their department, provided that role has been pre-provisioned in Superset. If it has not, the role is picked up on a later login once it exists.

    extraSecrets:
      ...

      custom_sso_security_manager.py: |
        import logging
        from superset.security import SupersetSecurityManager
        from models import CustomUser

        class CustomSsoSecurityManager(SupersetSecurityManager):
            user_model = CustomUser
            ...
            def auth_user_oauth(self, userinfo):
                """
                Method for authenticating user with OAuth.
                :userinfo: dict with user information
                           (keys are the same as User model columns)
                """
                # extract the username from `userinfo`
                if "username" in userinfo:
                    username = userinfo["username"]
                elif "email" in userinfo:
                    username = userinfo["email"]
                else:
                    logging.error("OAUTH userinfo does not have username or email {0}".format(userinfo))
                    return None
                # If username is empty, go away
                if (username is None) or username == "":
                    return None

                # Search the DB for this user
                user = self.find_user(username=username)
                # If user is not active, go away
                if user and (not user.is_active):
                    return None
                # If user is not registered, and not self-registration, go away
                if (not user) and (not self.auth_user_registration):
                    return None
                # Sync the user's roles
                if user and self.auth_roles_sync_at_login:
                    user_role_objects = set()
                    user_role_objects.add(self.find_role(self.auth_user_registration_role))
                    role_keys = set(userinfo.get("role_keys", []))
                    department = userinfo.get("department", "")

                    if department:
                        user.department = department
                        self.update_user(user)
                        # Dynamic role: only applied if a role named after the department exists
                        if self.find_role(department):
                            user_role_objects.add(self.find_role(department))
                    for role_key, fab_role_names in self.auth_roles_mapping.items():
                        if role_key in role_keys:
                            for fab_role_name in fab_role_names:
                                fab_role = self.find_role(fab_role_name)
                                if fab_role:
                                    user_role_objects.add(fab_role)
                    user.roles = list(user_role_objects)
                    logging.debug("Calculated new roles for user='{0}' as: {1}".format(username, user.roles))
                # If the user is new, register them
                if (not user) and self.auth_user_registration:
                    user_role_objects = set()
                    user_role_objects.add(self.find_role(self.auth_user_registration_role))
                    role_keys = set(userinfo.get("role_keys", []))
                    department = userinfo.get("department", "")
                    if department and self.find_role(department):
                        user_role_objects.add(self.find_role(department))
                    for role_key, fab_role_names in self.auth_roles_mapping.items():
                        if role_key in role_keys:
                            for fab_role_name in fab_role_names:
                                fab_role = self.find_role(fab_role_name)
                                if fab_role:
                                    user_role_objects.add(fab_role)
                    user = self.add_user(
                        username=username,
                        first_name=userinfo.get("first_name", ""),
                        last_name=userinfo.get("last_name", ""),
                        email=userinfo.get("email", "") or f"{username}@email.notfound",
                        role=list(user_role_objects),
                    )
                    # Only touch the user object if registration succeeded
                    if user and department:
                        user.department = department
                        self.update_user(user)
                    logging.debug("New user registered: {0}".format(user))
                    # If user registration failed, go away
                    if not user:
                        logging.error("Error creating a new OAuth user {0}".format(username))
                        return None
                # LOGIN SUCCESS (only if user is now registered)
                if user:
                    self.update_user_auth_stat(user)
                    return user
                else:
                    return None

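We have to account for several code paths while catching some error conditions. In simplified form, the flow is: resolve the username, look up the user, reject inactive users (and unregistered users when self-registration is disabled), sync roles for existing users, register new users with their calculated roles, and finally record the successful login.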
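
The role calculation is the part of this flow most worth checking in isolation. The following is a minimal sketch of it as a pure function, not the code that runs in Superset: roles are plain strings rather than FAB role objects, and `calculate_roles`, `existing_roles`, and the sample values are assumptions for illustration.

```python
def calculate_roles(
    role_keys: set[str],
    department: str,
    roles_mapping: dict[str, list[str]],
    existing_roles: set[str],
    default_role: str,
) -> set[str]:
    """Return the role names a user should hold after login."""
    roles = {default_role}
    # Dynamic role: only applied if a role named after the department exists.
    if department and department in existing_roles:
        roles.add(department)
    # Static mapping: one provider group maps to one or more internal roles.
    for key, names in roles_mapping.items():
        if key in role_keys:
            roles.update(name for name in names if name in existing_roles)
    return roles


mapping = {
    "MetricsUser": ["Gamma", "sql_lab"],
    "MetricsOwner": ["Alpha", "sql_lab"],
    "MetricsAdmin": ["Admin"],
}
# Hypothetical set of roles already provisioned in Superset.
existing = {"Admin", "Alpha", "Gamma", "sql_lab", "Finance"}

print(sorted(calculate_roles({"MetricsOwner"}, "Finance", mapping, existing, "Gamma")))
# ['Alpha', 'Finance', 'Gamma', 'sql_lab']
```

A user whose department has no matching role, for example `"Marketing"` here, simply ends up with the default role plus whatever their groups map to.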

We then set up our new SSO provider and configure the base role mapping onto the built-in Superset roles. We also add a few more settings, for example allowing user registration and syncing roles at login:

    configOverrides:
      enable_oauth: |
        import os  # harmless if the generated superset_config.py already imports it
        from flask_appbuilder.security.manager import (AUTH_DB, AUTH_OAUTH)
        AUTH_TYPE = AUTH_OAUTH
        OAUTH_PROVIDERS = [
            {
                "name": "new_provider",
                "whitelist": [ os.getenv("OAUTH_WHITELIST_REGEX", "") ],
                "icon": "fa-key",
                "token_key": "access_token",
                "remote_app": {
                    "client_id": os.environ.get("CLIENT_ID"),
                    "client_secret": os.environ.get("CLIENT_SECRET"),
                    "server_metadata_url": "https://provider.com/oidc/2/.well-known/openid-configuration",
                    "client_kwargs": {"scope": "openid name profile groups email params"},
                    "api_base_url": "https://provider.com/oidc/2/",
                    "authorize_params": {
                        "hd": os.getenv("OAUTH_HOME_DOMAIN", ""),
                        "redirect_uri": os.getenv("OAUTH_REDIRECT_URL"),
                    },
                }
            }
        ]
        # Map Authlib roles to superset roles
        AUTH_ROLE_ADMIN = 'Admin'
        # Will allow user self registration, allowing to create Flask users from Authorized User
        AUTH_USER_REGISTRATION = True
        # The default user self registration role
        AUTH_USER_REGISTRATION_ROLE = "Gamma"
        # Recalculate the user's roles on every login
        AUTH_ROLES_SYNC_AT_LOGIN = True
        # Provider group -> list of Superset roles
        AUTH_ROLES_MAPPING = {
            "MetricsUser": ["Gamma", "sql_lab"],
            "MetricsOwner": ["Alpha", "sql_lab"],
            "MetricsAdmin": ["Admin"]
        }
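
One detail in this configuration is worth a second look: the `whitelist` entry. The sketch below assumes, as we understand Flask-AppBuilder's behaviour, that each whitelist entry is applied to the user's email as a regular expression search; the `email_allowed` helper and the sample addresses are hypothetical and only illustrate that assumption.

```python
import re


def email_allowed(email: str, whitelist: list[str]) -> bool:
    """Return True if any whitelist pattern matches the email (regex search)."""
    return any(re.search(pattern, email) for pattern in whitelist)


print(email_allowed("jane@example.com", [r"@example\.com$"]))  # True
print(email_allowed("jane@other.org", [r"@example\.com$"]))    # False
# An empty pattern matches every string, so an unset OAUTH_WHITELIST_REGEX
# (which yields [""] in the config above) would not restrict anyone.
print(email_allowed("jane@other.org", [""]))                   # True
```

If that assumption holds for your FAB version, it is worth making sure `OAUTH_WHITELIST_REGEX` is always set in environments where registration should be restricted.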

Finally, we also extend the Jinja context with our new `department` field so that we can later use it in SQL templates.

    configOverrides:
      extend_oauth: |
        from flask import g
        from custom_sso_security_manager import CustomSsoSecurityManager
        CUSTOM_SECURITY_MANAGER = CustomSsoSecurityManager

        def current_user_department():
            # Fall back to "" when there is no logged-in user or no department set
            user = getattr(g, "user", None)
            return getattr(user, "department", None) or ""

        JINJA_CONTEXT_ADDONS = {
            'current_user_department': current_user_department
        }
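
To show what the addon does once it is registered, here is a minimal sketch that renders a templated query with plain Jinja2 outside Superset. The `sales` table, the stand-in function body, and the `"Finance"` value are assumptions for illustration; inside Superset the function would read the department from the logged-in user, and templating needs the `ENABLE_TEMPLATE_PROCESSING` feature flag turned on.

```python
from jinja2 import Template


def current_user_department() -> str:
    """Stand-in for the addon; in Superset it reads g.user.department."""
    return "Finance"


# Hypothetical templated query, as it might be written in SQL Lab.
sql_template = (
    "SELECT * FROM sales "
    "WHERE department = '{{ current_user_department() }}'"
)

rendered = Template(sql_template).render(
    current_user_department=current_user_department
)
print(rendered)
```

Because the value is interpolated into SQL text, it is only as trustworthy as the claim coming from the SSO provider, so the department names should be controlled on the provider side.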

TL;DR

This should help configure a new SSO provider, add custom fields to the user, authenticate and authorise the user, and perform dynamic role assignments based on a parameter passed from the SSO provider.

This builds on the very impressive documentation offered by Superset, but it needed some finagling to make it work. Hopefully it saves someone a few hours of digging through mountains of docs or the FAB and Superset source code.

#StayLazy
