validatedpatterns · dminnear-rh · Jun 4, 2026 · Jun 3, 2026 · Jun 4, 2026
diff --git a/content/patterns/maas-quickstart/_index.adoc b/content/patterns/maas-quickstart/_index.adoc
@@ -0,0 +1,39 @@
+---
+title: MaaS Code Assistant AI Quickstart
+date: 2026-06-03
+tier: sandbox
+summary: This pattern deploys a multi-tenant AI code assistant with NVIDIA Nemotron models, tiered rate limiting, and IDE integration on OpenShift.
+rh_products:
+  - Red Hat OpenShift Container Platform
+  - Red Hat OpenShift AI
+  - Red Hat OpenShift DevSpaces
+  - Red Hat Connectivity Link
+industries:
+  - General
+focus_areas:
+  - AI
+  - Code
+  - AI Quickstart
+aliases: /maas-quickstart/
+links:
+  github: https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant
+  install: getting-started
+  bugs: https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant/issues
+  feedback: https://docs.google.com/forms/d/e/1FAIpQLScI76b6tD1WyPu2-d_9CCVDr3Fu5jYERthqLKJDUGwqBg7Vcg/viewform
+---
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+include::modules/maas-quickstart-about.adoc[leveloffset=+1]
+
+include::modules/maas-quickstart-architecture.adoc[leveloffset=+1]
+
+[id="next-steps-maas-quickstart"]
+== Next steps
+
+* link:getting-started[Install this pattern]
+* link:cluster-sizing[Cluster sizing]
+* link:customizing-this-pattern[Customizing this pattern]
+* link:troubleshooting[Troubleshooting]
diff --git a/content/patterns/maas-quickstart/cluster-sizing.adoc b/content/patterns/maas-quickstart/cluster-sizing.adoc
@@ -0,0 +1,29 @@
+---
+title: Cluster sizing
+weight: 30
+aliases: /maas-quickstart/cluster-sizing/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+include::modules/ai-quickstart-maas-code-assistant/metadata-ai-quickstart-maas-code-assistant.adoc[]
+
+include::modules/cluster-sizing-template.adoc[]
+
+[id="maas-quickstart-gpu-node-requirements"]
+== GPU node requirements
+
+In addition to the worker nodes listed above, this pattern requires at least 2 GPU-equipped nodes for model inference. On AWS, the pattern automatically provisions `g6e.2xlarge` instances with NVIDIA L40S GPUs. On other providers and bare metal, GPU nodes must already be part of the cluster before deploying the pattern.
+
+.GPU node minimum requirements
+[cols="<,^,<,<"]
+|===
+| Cloud provider | Node type | Number of nodes | Instance type
+
+| Amazon Web Services
+| GPU Worker
+| 2
+| g6e.2xlarge
+|===
diff --git a/content/patterns/maas-quickstart/customizing-this-pattern.adoc b/content/patterns/maas-quickstart/customizing-this-pattern.adoc
@@ -0,0 +1,143 @@
+---
+title: Customizing this pattern
+weight: 20
+aliases: /maas-quickstart/customizing/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+[id="customizing-maas-quickstart"]
+== Customizing the MaaS Code Assistant AI Quickstart pattern
+
+This pattern deploys an AI code assistant with tiered user access, rate limiting, and NVIDIA Nemotron model serving. You can customize the models, rate limit policies, user tiers, and IDE configuration.
+
+[id="changing-models-maas"]
+=== Changing models
+
+The pattern serves two models by default:
+
+* `nemotron-3-nano-30b-a3b-fp8` -- Available to premium and enterprise tier users.
+* `gpt-oss-20b` -- Available to all user tiers.
+
+To change or add models, edit the `models` list in `overrides/maas-quickstart.yaml`. The pattern pulls models from OCI registries and does not require a HuggingFace API token.
+
+The model definitions specify the model URI, resource requirements, GPU tolerations, and vLLM arguments. For example:
+
+[source,yaml]
+----
+models:
+  - name: gpt-oss-20b
+    displayName: OpenAI gpt-oss-20b
+    uri: oci://registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5
+    resources:
+      limits:
+        cpu: "4"
+        memory: 24Gi
+        nvidia.com/gpu: "1"
+      requests:
+        cpu: "2"
+        memory: 16Gi
+        nvidia.com/gpu: "1"
+    extraArgs:
+      - --enable-force-include-usage
+    tolerations:
+      - effect: NoSchedule
+        key: nvidia.com/gpu
+        operator: Exists
+----
+
+[NOTE]
+====
+Each model requires a GPU with at least 48 GB of VRAM. Adding models beyond the default two requires additional GPU nodes.
+====
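+
+As a minimal sketch, an additional entry in the `models` list might look like the following. It reuses the keys from the preceding example; the `name`, `displayName`, and `uri` values are hypothetical placeholders, not models that ship with the pattern, and the CPU, memory, and vLLM settings are left out because they depend on the model.
+
+[source,yaml]
+----
+models:
+  # Existing entries such as gpt-oss-20b stay in the list unchanged.
+  - name: my-extra-model                      # hypothetical model name
+    displayName: My extra model               # hypothetical display name
+    uri: oci://<registry>/<repository>:<tag>  # placeholder; replace with a real OCI model image
+    resources:
+      limits:
+        nvidia.com/gpu: "1"
+      requests:
+        nvidia.com/gpu: "1"
+    tolerations:
+      - effect: NoSchedule
+        key: nvidia.com/gpu
+        operator: Exists
+----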
+
+[id="adjusting-rate-limits-maas"]
+=== Adjusting rate limits and user tiers
+
+The pattern uses Kuadrant (Red Hat Connectivity Link) to enforce per-tier rate limits on inference requests. The default tiers and limits are:
+
+[cols="1,1,2",options="header"]
+|===
+| Tier | Rate limit | Description
+
+| Free
+| 5 requests per 2 minutes
+| Basic access for evaluation
+
+| Premium
+| 20 requests per 2 minutes
+| Standard production usage
+
+| Enterprise
+| 50 requests per 2 minutes
+| High-throughput workloads
+|===
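+
+Assuming the `tiers` schema used in the override example in this section, the default request limits in the table would correspond roughly to the following values. This is a sketch derived from the table, not a copy of the pattern's defaults; user lists and token limits are omitted because the table does not state them.
+
+[source,yaml]
+----
+tiers:
+  free:
+    requestRates:
+      - limit: 5       # 5 requests
+        window: 2m     # per 2 minutes
+  premium:
+    requestRates:
+      - limit: 20
+        window: 2m
+  enterprise:
+    requestRates:
+      - limit: 50
+        window: 2m
+----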
+
+To adjust rate limits, modify the `tiers` section in `overrides/maas-quickstart.yaml`. The following example sets the premium tier to 40 requests per 2 minutes and 20000 tokens per 1 minute:
+
+[source,yaml]
+----
+tiers:
+  premium:
+    users:
+      - premium-user
+    requestRates:
+      - limit: 40
+        window: 2m
+    tokenRates:
+      - limit: 20000
+        window: 1m
+----
+
+Push your changes to your forked repository so the GitOps framework applies the updated configuration.
+
+[id="managing-users-maas"]
+=== Managing users
+
+The pattern authenticates users through OpenShift OAuth with an htpasswd identity provider. The default users are:
+
+* `admin` -- Full administrative access (enterprise tier)
+* `free-user` -- Free tier access
+* `premium-user` -- Premium tier access
+* `enterprise-user` -- Enterprise tier access
+
+{hashicorp-vault} and the {eso-op} store and manage the user passwords that you define in the `values-secret.yaml` file. To change a user password after initial deployment, update the secret value in your `values-secret.yaml` file and redeploy the pattern.
+
+To assign users to different tiers, modify the `tiers` section in `overrides/maas-quickstart.yaml`:
+
+[source,yaml]
+----
+tiers:
+  free:
+    users:
+      - free-user
+  premium:
+    users:
+      - premium-user
+      - user1
+  enterprise:
+    users:
+      - admin
+      - enterprise-user
+----
+
+[id="configuring-devspaces-maas"]
+=== Configuring OpenShift Dev Spaces
+
+The pattern integrates the Continue AI extension in OpenShift Dev Spaces to provide code assistance directly in the IDE. Dev Spaces is preconfigured to clone the AI Quickstart repository and connect to the vLLM inference endpoints.
+
+To customize the Dev Spaces configuration, you can adjust:
+
+* Default IDE settings and extensions
+* Resource limits for developer workspaces
+* The inference endpoint URL used by the Continue extension
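+
+For the last item, a model entry in a Continue `config.yaml` file that points at a different endpoint might look like the following sketch. It assumes the endpoint is OpenAI-compatible, as the vLLM server is. The assistant name, the URL (including its path), and the API key are placeholders, and how the pattern delivers this file to the workspace is not shown here.
+
+[source,yaml]
+----
+name: maas-code-assistant        # hypothetical assistant name
+version: 1.0.0
+schema: v1
+models:
+  - name: gpt-oss-20b
+    provider: openai             # OpenAI-compatible API served by vLLM
+    model: gpt-oss-20b
+    apiBase: https://<inference-endpoint-host>/v1  # placeholder URL; the path depends on your gateway
+    apiKey: <your-api-token>     # placeholder credential
+    roles:
+      - chat
+      - edit
+----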
+
+[id="gpu-node-provisioning-maas"]
+=== Provisioning GPU nodes
+
+This pattern requires at least 2 NVIDIA GPU nodes with 48 GB or more of VRAM each. On AWS, the pattern automatically provisions `g6e.2xlarge` GPU machine sets with NVIDIA L40S GPUs.
+
+On other providers and bare metal, if your cluster does not have GPU nodes, you must add them before you deploy the pattern. The pattern installs all required operators, including the NVIDIA GPU Operator, automatically during deployment.
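+
+For reference, on AWS the GPU capacity is expressed through the instance type of a machine set. The following fragment is an illustrative sketch of the relevant `MachineSet` fields, not the manifest that the pattern generates. The name is hypothetical, required fields such as the AMI, subnet, and selector labels are omitted, and the taint is an assumption chosen to match the `nvidia.com/gpu` toleration in the model definitions.
+
+[source,yaml]
+----
+apiVersion: machine.openshift.io/v1beta1
+kind: MachineSet
+metadata:
+  name: gpu-worker-example             # hypothetical name
+  namespace: openshift-machine-api
+spec:
+  replicas: 2                          # the pattern needs at least 2 GPU nodes
+  template:
+    spec:
+      taints:
+        - key: nvidia.com/gpu          # assumed taint, tolerated by the model pods
+          effect: NoSchedule
+      providerSpec:
+        value:
+          instanceType: g6e.2xlarge    # NVIDIA L40S GPU instance
+----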