Apple's Latin America (ALAC) AI/ML team is looking for a rare combination: someone who is as comfortable discussing sales channel dynamics and business process logic as they are designing graph schemas and data pipelines.
As an AIML Data Quality and Governance Scientist, your core work will be to listen deeply to business experts, extract and formalize what they know, and encode it into Enterprise Knowledge Graphs that become the trusted foundation for AI agents, analytical tools, and decision-support systems used by sales teams across ALAC.
This is not a role that works in isolation. The knowledge graphs you build must fit coherently into a broader, interconnected enterprise knowledge ecosystem - and your definitions, ontologies, and data models must be consistent with and linkable to graphs owned by partner teams, so that the sales user always gets a single, coherent, trustworthy picture.
You will also lead data quality and governance efforts - translating the CoE Lead's governance strategy into documented, enforceable policies and operational standards across ALAC's data assets - and design ingestion architectures for unstructured, structured, and partner/channel/third-party data sources. Regular status and impact reporting to leadership on data quality trends and governance progress will be part of your ongoing rhythm.
Description
Business Knowledge Capture & Translation
- Engage deeply with business stakeholders across sales, channel operations, and analytics, in close collaboration with the AI Business Process Champion, to extract domain expertise and bridge how the business thinks about its data with how AI systems need that knowledge structured
- Translate business concepts, sales processes, channel structures, and operational logic into precise, formal ontologies and knowledge graph schemas
- Codify metric definitions, calculation rules, and business hierarchies in a way that is unambiguous, governed, and accessible to both humans and AI systems
Enterprise Knowledge Graph Development
- Design, build, and continuously evolve enterprise knowledge graphs that represent ALAC's business entities, data assets, metrics, channels, and processes in a machine-readable, AI-consumable form - maintaining backward compatibility and communicating schema changes to downstream consumers
- Ensure knowledge models are architecturally consistent with and linkable to enterprise knowledge graphs owned by global partner teams - your graph is a regional node in a larger connected ecosystem
- Incorporate partner, channel, and third-party data sources into the knowledge graph - ensuring external data is governed, traceable, and integrated with the same rigor as internal assets
- Map end-to-end data lineage from raw source to derived metric, consuming the AI Business Process Champion's process maps as primary inputs when modeling business flows and interdependencies
Data Quality & Governance
- Build and operationalize data quality monitoring pipelines - defining rules, anomaly detection, and drift analysis to surface issues before they reach AI outputs or sales-facing tools
- Document and enforce data governance policies and procedures - translating the CoE Lead's governance strategy into operational rules, standards, and enforcement mechanisms, with quality checks embedded at ingestion, transformation, and serving layers
- Optimize existing data processes and workflows for efficiency - actively improving how data flows through the organization, not just monitoring quality
- Establish data quality SLAs, dashboards, and regular leadership reporting on governance progress, data quality trends, and alignment with the business roadmap
Unstructured Data Architecture
- Propose and design ingestion pipelines for unstructured content - market reports, channel briefings, operational documents - making them first-class citizens in the knowledge graph
- Apply embedding models, entity extraction, and classification techniques to transform unstructured inputs into structured, linked knowledge - ensuring AI tools can reason with equal confidence over structured and unstructured sources
AI Enablement for Sales Users
- Design retrieval architectures - including vector search and graph traversal - aligned to the CoE Lead's solution architecture, and build evaluation pipelines to measure how accurately AI agents consume ALAC-specific knowledge
- Collaborate with global AI platform and engineering teams to ensure ALAC knowledge is correctly integrated and surfaced in shared AI tools and agents
- Keep the sales user experience as the north star - every knowledge modeling decision should make it easier for a sales team member to get a fast, accurate, trustworthy answer
Cross-functional & Global Collaboration
- Collaborate actively with regional and global teams spanning analytics, AI platforms, technology infrastructure, sales operations, and channel management - navigating a complex, multi-stakeholder environment with clarity and credibility
- Align ALAC knowledge models, ontology choices, and metric definitions with global standards - and influence upstream data model design so new data assets are born with the quality and semantic richness the knowledge graph requires
- Establish and communicate a clear roadmap for ALAC's knowledge graph and data quality initiatives, managing expectations across regional and global stakeholders
Preferred Qualifications
Experience working within a federated enterprise knowledge graph ecosystem - building graphs designed to interoperate with graphs owned by other teams
Background in sales, channel operations, or business process domains
Experience integrating knowledge graphs with LLM-based agents and RAG pipelines
Familiarity with data governance frameworks (DAMA-DMBOK or equivalent) and experience managing governance projects from planning through execution
Experience with partner, channel, or third-party data sets - ingestion, quality assessment, and integration into governed environments
Experience leveraging AI/ML and automation to scale governance processes and improve data quality programmatically
Spanish proficiency
Advanced degree (MS or Ph.D.) a plus
Minimum Qualifications
6+ years in Data Science, Knowledge Engineering, or a related field - with demonstrated experience extracting complex business concepts and translating them into formal ontologies, schemas, or knowledge graph structures
Hands-on experience designing and building knowledge graphs (e.g., Neo4j, GraphDB, Amazon Neptune) in a production environment
Proficiency in graph query languages (Cypher, SPARQL, Gremlin), semantic standards (RDF, OWL, SHACL), and metadata management tools (e.g., Alation, Collibra, DataHub)
Strong experience with data quality frameworks - rule definition, monitoring pipelines, anomaly detection, and drift analysis
Experience designing ingestion pipelines for unstructured data as well as ETL/integration workflows for structured data across internal and external systems
Proficiency in SQL, Python, and at least one major data platform (Snowflake, Spark, or Hadoop)
Familiarity with vector similarity search, embedding models, and RAG architectures
Strong communication and stakeholder management skills - comfortable influencing across regional and global teams without direct authority
Bachelor's degree in Computer Science, Information Science, Statistics, Engineering, or a related field
Fluent in English and Portuguese