Retrieving curated, labeled datasets for ML model training and evaluation.
A core strength of Melodi is its ability to turn real-world user interactions, reviewed and organized by product and customer experts, into structured datasets ready for machine learning (ML) tasks.
While product managers use Melodi to understand user experience and identify trends, the platform simultaneously curates production data in an ML-friendly format. This curated data, enriched with feedback, issue tags, and user intent classifications, becomes a powerful resource for data science and ML engineering teams.
Instead of relying solely on synthetic data or manually scraped examples, teams can use Melodi to access high-quality, production-grade data for ML model training and evaluation.
Melodi’s data model stores interactions as “Threads,” which contain sequences of “Messages.” This structure, inspired by formats like OpenAI’s but adapted for broader analytics, inherently supports multi-turn conversations and includes distinct message types (e.g., user, assistant, tool calls, RAG lookups). This organization makes it straightforward to parse and utilize conversation data for various ML techniques.
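To make the Thread and Message structure concrete, here is a minimal sketch of how such data could be flattened into chat-style training records. The `Message` and `Thread` dataclasses, their field names, and the role labels are assumptions chosen for illustration; they are not Melodi's actual schema or SDK types.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Message:
    # Hypothetical fields; the real message object may carry more metadata.
    role: str      # assumed labels such as "user", "assistant", "tool"
    content: str


@dataclass
class Thread:
    # Hypothetical container: an ordered sequence of messages.
    id: str
    messages: List[Message] = field(default_factory=list)


def to_chat_example(
    thread: Thread,
    keep_roles: Tuple[str, ...] = ("user", "assistant"),
) -> Dict[str, list]:
    """Keep only the selected roles, preserving the original turn order."""
    return {
        "messages": [
            {"role": m.role, "content": m.content}
            for m in thread.messages
            if m.role in keep_roles
        ]
    }


def to_jsonl(threads: List[Thread]) -> str:
    """Serialize one training record per line."""
    return "\n".join(json.dumps(to_chat_example(t)) for t in threads)


thread = Thread(
    id="t-1",
    messages=[
        Message("user", "Where is my order?"),
        Message("tool", "lookup_order(id=123) -> shipped"),
        Message("assistant", "Your order shipped yesterday."),
    ],
)
print(to_jsonl([thread]))
```

Dropping tool and retrieval messages is only one possible choice; passing a different `keep_roles` tuple retains them for techniques that need the full trace.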
Data scientists can retrieve precisely the data they need using Melodi’s API or, more conveniently, the Python SDK. The SDK provides powerful filtering capabilities through the ThreadsQueryParams class, allowing you to slice and dice your production data effectively.
Here are some of the most useful ThreadsQueryParams parameters available when fetching threads via the SDK:
projectId: Filter threads belonging to a specific project.
hasFeedback: Retrieve only threads that have associated feedback/labels.
intentIds: Get threads classified with specific User Intent IDs.
issueIds: Get threads tagged with specific Issue IDs.
userSegmentIds: Filter threads associated with users belonging to particular segments.
search: Perform a text-based search across message content within threads.
before / after: Filter threads based on creation time.
externalIds: Retrieve specific threads using your own external identifiers.
ids: Retrieve specific threads using Melodi’s internal IDs.
includeFeedback / includeIntents / includeIssues: Optionally include associated feedback, intent, or issue objects directly in the thread response payload (can increase response size).
pageSize / pageIndex: Control pagination for large datasets.

This granular filtering enables the creation of highly specific datasets tailored to the exact ML problem you’re tackling.
Here’s a concise example of using the Python SDK to fetch threads from a specific project that have feedback:
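A minimal sketch of that query follows, under stated assumptions. The `ThreadsQueryParams` class below is a local stand-in that mirrors a few of the parameter names listed above; it is not the SDK's own class. `fetch_page` is a hypothetical callable representing whatever SDK or HTTP call returns one page of threads, and `fake_fetch_page` is an in-memory stub so the example runs without credentials. Zero-based `pageIndex`, an integer project ID, and the default page size are also assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class ThreadsQueryParams:
    """Local stand-in mirroring the documented filter names (not the SDK class)."""
    projectId: Optional[int] = None        # assumed to be an integer ID
    hasFeedback: Optional[bool] = None
    includeFeedback: Optional[bool] = None
    pageSize: int = 50                     # assumed default
    pageIndex: int = 0                     # assumed zero-based

    def to_query(self) -> Dict[str, object]:
        # Send only the filters that were actually set.
        return {k: v for k, v in asdict(self).items() if v is not None}


FetchPage = Callable[[Dict[str, object]], List[dict]]


def iter_threads(params: ThreadsQueryParams, fetch_page: FetchPage) -> Iterator[dict]:
    """Walk pages until an empty or short page signals the end."""
    page = params.pageIndex
    while True:
        batch = fetch_page({**params.to_query(), "pageIndex": page})
        if not batch:
            return
        yield from batch
        if len(batch) < params.pageSize:
            return
        page += 1


def fake_fetch_page(query: Dict[str, object]) -> List[dict]:
    """In-memory stub standing in for the real thread-fetching call."""
    data = [{"id": i, "projectId": 1, "hasFeedback": i % 2 == 0} for i in range(7)]
    matches = [
        t for t in data
        if t["projectId"] == query["projectId"]
        and (not query.get("hasFeedback") or t["hasFeedback"])
    ]
    start = query["pageIndex"] * query["pageSize"]
    return matches[start:start + query["pageSize"]]


params = ThreadsQueryParams(projectId=1, hasFeedback=True, includeFeedback=True, pageSize=2)
threads = list(iter_threads(params, fake_fetch_page))
print(f"Fetched {len(threads)} threads with feedback")
```

In real use, `fake_fetch_page` would be replaced by the SDK's thread-fetching call, and the stand-in class by the SDK's own ThreadsQueryParams.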
By connecting product insights with powerful data retrieval tools, Melodi empowers both product and ML teams to collaborate effectively in improving AI agent performance based on real-world usage.