Integrating the OpenAI Moderation Model in Spring AI
When building applications that handle user input, such as forums, chatbots, or social platforms, it is essential to protect users from unsafe or harmful content. OpenAI’s Moderation model provides a reliable way to detect problematic categories, including hate speech, harassment, self-harm, and violence. In this article, we will demonstrate how to build a Spring Boot application that integrates OpenAI’s moderation model using Spring AI.
1. Project Setup
First, we need to set up a Spring Boot project that uses Spring AI. We’ll use Maven as the build tool, but you can easily adapt the setup to Gradle if you prefer.
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
Adding this to the pom.xml configures the core dependencies of our application and enables OpenAI integration through the spring-ai-starter-model-openai starter.
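The starter’s version is typically managed by the Spring AI BOM. If your project does not already import it, a minimal sketch of the dependencyManagement entry looks like this (the version shown, 1.0.0, is only an example; use the Spring AI release your project targets):

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>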
2. Application Configuration
We configure the application with an API key and define which moderation model to use.
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.moderation.options.model=omni-moderation-latest
This configuration sets the OpenAI API key from an environment variable (OPENAI_API_KEY) and specifies omni-moderation-latest as the moderation model.
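If you prefer YAML over properties, the equivalent application.yml is a straightforward sketch of the same two settings:

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      moderation:
        options:
          model: omni-moderation-latest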
3. Service Layer
The service layer encapsulates the moderation logic by handling calls to the moderation model, interpreting results, and providing a clear summary of detected violations.
@Service
public class ModerationService {

    private final OpenAiModerationModel moderationModel;

    public ModerationService(OpenAiModerationModel moderationModel) {
        this.moderationModel = moderationModel;
    }

    public String analyzeContent(String content) {
        ModerationPrompt prompt = new ModerationPrompt(content);
        ModerationResponse moderationResponse = moderationModel.call(prompt);
        return moderationResponse.getResult().getOutput().getResults().stream()
          .map(this::summarizeViolations)
          .collect(Collectors.joining("\n"));
    }

    private String summarizeViolations(ModerationResult result) {
        Categories categories = result.getCategories();
        List<String> violations = new ArrayList<>();
        if (categories.isLaw()) {
            violations.add("Law");
        }
        if (categories.isFinancial()) {
            violations.add("Financial");
        }
        if (categories.isPii()) {
            violations.add("Personally Identifiable Information/PII");
        }
        if (categories.isSexual()) {
            violations.add("Sexual");
        }
        if (categories.isHate()) {
            violations.add("Hate");
        }
        if (categories.isHarassment()) {
            violations.add("Harassment");
        }
        if (categories.isSelfHarm()) {
            violations.add("Self-Harm");
        }
        if (categories.isSexualMinors()) {
            violations.add("Sexual/Minors");
        }
        if (categories.isHateThreatening()) {
            violations.add("Hate/Threatening");
        }
        if (categories.isViolenceGraphic()) {
            violations.add("Violence/Graphic");
        }
        if (categories.isSelfHarmIntent()) {
            violations.add("Self-Harm/Intent");
        }
        if (categories.isSelfHarmInstructions()) {
            violations.add("Self-Harm/Instructions");
        }
        if (categories.isHarassmentThreatening()) {
            violations.add("Harassment/Threatening");
        }
        if (categories.isViolence()) {
            violations.add("Violence");
        }
        return violations.isEmpty()
          ? "No category violations detected."
          : "Violated categories: " + String.join("; ", violations);
    }
}
The ModerationService class integrates with the OpenAI Moderation Model to analyze user content for safety violations. In the analyzeContent method, the input text is first wrapped in a ModerationPrompt object, which serves as the structured request passed to the moderation API. The model processes this prompt and produces a ModerationResponse, which contains detailed results about the categories that may have been triggered by the content.
The summarizeViolations method then inspects the Categories object within each ModerationResult. For every flagged category, the method adds a descriptive label to an ArrayList. Finally, it returns either a message confirming no violations or a formatted string listing all the categories that were violated.
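Callers that only need a yes/no decision rather than a summary string can reuse the same result object. Below is a minimal sketch of two extra methods that could be added to ModerationService, using only the getters already shown above; isSafe and hasViolations are hypothetical helpers introduced here for illustration:

// Returns true when none of the checked categories are flagged for the given content.
public boolean isSafe(String content) {
    ModerationPrompt prompt = new ModerationPrompt(content);
    ModerationResponse response = moderationModel.call(prompt);
    return response.getResult().getOutput().getResults().stream()
      .noneMatch(this::hasViolations);
}

// Checks a subset of the moderation categories; extend with the remaining getters as needed.
private boolean hasViolations(ModerationResult result) {
    Categories c = result.getCategories();
    return c.isHate() || c.isHarassment() || c.isSelfHarm()
      || c.isSexual() || c.isViolence() || c.isViolenceGraphic();
}

A controller or another service could then call isSafe(input) before persisting or displaying the text.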
4. REST Controller
We expose a REST endpoint that receives text and responds with moderation results.
@RestController
@RequestMapping("/api/moderation")
public class ModerationController {

    private final ModerationService moderationService;

    public ModerationController(ModerationService moderationService) {
        this.moderationService = moderationService;
    }

    @PostMapping
    public ResponseEntity<String> moderate(@RequestBody String input) {
        String result = moderationService.analyzeContent(input);
        return ResponseEntity.ok(result);
    }
}
This controller defines a REST API for moderating text input. When a POST request is made to /api/moderation, the provided text is passed to the ModerationService, which analyzes it and returns detected category violations. The result is wrapped in a ResponseEntity and sent back to the client as the response.
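If clients send JSON instead of plain text, the endpoint can bind to a small DTO. The following sketch shows an additional handler that could live in the same controller; ModerationRequest is a hypothetical record introduced here purely for illustration:

// Hypothetical request body: { "text": "..." }
public record ModerationRequest(String text) {}

// Accepts JSON on a separate sub-path so the plain-text endpoint above keeps working.
@PostMapping(path = "/json", consumes = MediaType.APPLICATION_JSON_VALUE)
public ResponseEntity<String> moderateJson(@RequestBody ModerationRequest request) {
    return ResponseEntity.ok(moderationService.analyzeContent(request.text()));
}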
5. Running and Testing the Application
After completing the configuration, start the application using mvn spring-boot:run and test moderation with various inputs.
Violence
curl -X POST http://localhost:8080/api/moderation \
  -H "Content-Type: text/plain" \
  -d "I want to hurt someone badly."
Harassment and Hate
curl -X POST http://localhost:8080/api/moderation \
  -H "Content-Type: text/plain" \
  -d "You are worthless and I hate your existence."
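Safe Content

For comparison, a harmless input should come back with the service’s no-violation message ("No category violations detected."); the text below is just an arbitrary benign sentence:

curl -X POST http://localhost:8080/api/moderation \
  -H "Content-Type: text/plain" \
  -d "Have a wonderful day!"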
6. Conclusion
In this article, we demonstrated how to integrate the OpenAI Moderation Model into a Spring Boot application using Spring AI. We covered setting up dependencies, configuring the application, implementing the service layer for handling moderation logic, and exposing a REST endpoint to analyze user input. With this setup, you can add a content safety layer to your Spring applications, ensuring that harmful or unsafe content is detected before being processed or displayed.
7. Download the Source Code
This article explored the integration of the OpenAI Moderation model using Spring AI.
You can download the full source code of this example here: spring ai openai moderation model