A renowned AI platform reached out to us to enhance its existing ML models. Their primary objective was to have those models filter out spam, hate speech, and misinformation.
Given the large influx of data they regularly handled, they wanted to collaborate with results-oriented AI/ML experts, and they chose Macgence. Specifically, they were looking for effective solutions that could:
- Detect hate speech, misinformation, and spam across multiple domains each month.
- Provide highly skilled labelers with a deep understanding of local cultural norms and events.
- Ensure labelers were fluent in multiple languages, including English, Spanish, French, Mandarin, Italian, Japanese, Arabic, Portuguese, Turkish, and German.
Smooth Execution
The following roadmap outlines the steps we took to meet our client’s requirements.
- Creating a Specialized Data Labeling Team
- Because our customer’s criteria for assessing misinformation and spam were unique and advanced, we built a custom labeling team in which every member was an expert in their field.
- To meet these requirements, we created a total of 30 teams, each specializing in different domains and languages. These teams grew continuously over time and worked relentlessly to deliver more than 1.5 million labels per week.
- With our labeling interface, we could easily meet our customer’s specific data collection requirements. We included multiple-choice questions, free responses, checkboxes, NER tagging, conditional logic, and other options in the labeling tasks.
- Assigning a Dedicated Project Coordinator
- For transparent and timely communication, we assigned the client a dedicated project coordinator. Our team met with the client regularly to receive feedback and improve their experience at Macgence.
- Our dedicated project coordinator added an extra safeguard to the process by performing a quality check on the data before sending it to the client. This way, the majority of errors were resolved on our end, resulting in a smoother client experience.
- Our client was happy with our decision to appoint a dedicated project coordinator, as they could communicate their ideas clearly and received prompt responses to their queries.
- The client even commented that our project coordinator, thanks to hands-on experience, understood their project better than they did.
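The labeling options mentioned above (multiple-choice questions, checkboxes, free responses, and conditional logic) can be pictured with a minimal Python sketch. This is an illustration only, not Macgence's actual interface: the `Question` class, the `show_if` field, and the example task are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Question:
    # Hypothetical schema for one question in a labeling task.
    key: str
    kind: str  # "multiple_choice", "checkbox", or "free_response"
    prompt: str
    options: list = field(default_factory=list)
    # Conditional logic: show the question only if this predicate
    # holds for the answers given so far (None means always show).
    show_if: Optional[Callable[[dict], bool]] = None


def visible_questions(questions, answers):
    """Return the keys of the questions to display for these answers."""
    return [q.key for q in questions
            if q.show_if is None or q.show_if(answers)]


def validate(questions, answers):
    """Return a list of problems found in a submitted answer set."""
    problems = []
    for q in questions:
        if q.show_if is not None and not q.show_if(answers):
            continue  # hidden questions need no answer
        value = answers.get(q.key)
        if value is None or value == "" or value == []:
            problems.append(f"{q.key}: missing answer")
        elif q.kind == "multiple_choice" and value not in q.options:
            problems.append(f"{q.key}: unknown option")
        elif q.kind == "checkbox" and not set(value) <= set(q.options):
            problems.append(f"{q.key}: unknown option")
    return problems


# Example task (assumed for illustration).
TASK = [
    Question("category", "multiple_choice",
             "Which category best fits this post?",
             ["hate_speech", "misinformation", "spam", "none"]),
    Question("spam_signals", "checkbox",
             "Which spam signals are present?",
             ["link_farm", "repeated_text", "fake_giveaway"],
             show_if=lambda a: a.get("category") == "spam"),
    Question("rationale", "free_response",
             "Briefly explain your choice.",
             show_if=lambda a: a.get("category") not in (None, "none")),
]
```

In this sketch, a labeler who picks "spam" is then shown the spam-signal checkboxes and the rationale field, while a labeler who picks "none" can submit right away.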
Results
The customer enjoyed working with us because they got impeccable results in minimal time. Macgence boosted the Area Under the Curve (AUC) of our client’s ML models by 60%, which is a huge achievement in itself. The client has also doubled the number and tripled the quality of their datasets, and has increased the speed of their data pipelines 15-fold.
Our client was very happy with the results. They were impressed with our labelers, who, according to them, were more effective at identifying misinformation than other fact-checkers.
Over the last year, our client has received more than 55 million high-quality labels, covering everything from hate speech to misinformation to spam.
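For readers unfamiliar with the AUC metric cited above, the sketch below shows one common way to compute the area under the ROC curve for a binary classifier: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy data are assumptions for illustration. The case study does not state how the client measured AUC or whether the 60% gain is relative or absolute.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth values (1 = harmful content).
    scores: model scores, where higher means "more likely harmful".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data: one harmful post (score 0.4) is ranked below a benign one (0.6).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so better training labels typically show up as movement toward 1.0.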
Applications of Content Moderation
Training Data Quality
Content moderation ensures the quality of training data for AI/ML models by filtering out irrelevant, incorrect, or biased data.
Bias Detection and Mitigation
AI-driven content moderation identifies and mitigates biases in training datasets.
Toxic Content Filtering
Moderation tools automatically filter out toxic content from training data. This is crucial for developing AI/ML models.
Spam and Irrelevant Data Removal
AI content moderation removes spam and irrelevant data from training datasets. This enhances the efficiency of AI/ML models.
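As a rough illustration of the filtering described above, the sketch below drops records from a training set when most of their moderation labels mark them as harmful. The record layout (`text`, `labels`), the label names, and the `min_agreement` threshold are hypothetical choices for this example, not a description of any specific production pipeline.

```python
BLOCKED_LABELS = frozenset({"hate_speech", "misinformation", "spam"})


def filter_training_data(records, blocked=BLOCKED_LABELS, min_agreement=0.5):
    """Split records into (kept, removed) based on labeler votes.

    A record is removed when the share of its labels that fall in
    `blocked` is strictly greater than `min_agreement`.
    """
    kept, removed = [], []
    for rec in records:
        votes = rec["labels"]
        flagged = sum(1 for v in votes if v in blocked)
        if votes and flagged / len(votes) > min_agreement:
            removed.append(rec)
        else:
            kept.append(rec)
    return kept, removed


# Toy records, each labeled independently by several labelers.
sample = [
    {"text": "Win a free phone now!!!", "labels": ["spam", "spam", "none"]},
    {"text": "The library opens at 9 am.", "labels": ["none", "none", "none"]},
    {"text": "Limited offer, read more.", "labels": ["none", "spam"]},
]
kept, removed = filter_training_data(sample)
```

Requiring agreement among several labelers, rather than acting on a single vote, is one simple way to avoid discarding borderline but legitimate content.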
The Macgence Way
TAT
Compliant, high-quality, customizable data is at your disposal and can be delivered quickly.
QUALITY
Our datasets go through rigorous two-level quality checks before delivery.
COMPLIANCE
We adhere to both HIPAA and GDPR compliance requirements.
ACCURACY
We provide ~98% accuracy across different annotation types and model datasets.
NO. OF USE CASES SOLVED
We have experience across a diverse range of use cases.