We are looking for a full-time Data Scientist (Voice/Search: STT, TTS, STS) with 1-2 years of AI experience
Manan AI was founded by a CEO with a Ph.D. and consists of AI and digital media professionals with Stanford, Facebook, and Google backgrounds, advised by a former partner of a top-tier Silicon Valley VC. We build a strategic framework for the impact of disruptive content and empower creators with applications of synthetic media AI generators. All video and voice creation will be done via generative methods, as a decentralized Bloomberg & Hollywood on your mobile and laptop, and can be used to make your creative content or interactive games, videos, virtual personas, movies, and immersive experiences. We apply cutting-edge ML/AI to speech-to-text (S2T), NLP, summarization, semantic search, speech recognition, language translation, NER, and synthetic media generation. We create collaborative fiction with GPT-2/GPT-3 (OpenAI) and Facebook AI to explore our common humanity as creators. Target: B2B, B2C, Entertainment, Marketing & Customer Service, Advertising, Security, and Privacy for Deep Fake.
We are a team of amazing millennial entrepreneurs, developers, and AI scientists who are working on solving the task of AI x Search x Editing x Share = Video, voices, and thoughts generated in collaboration with AI.
We work with data from a wide variety of sources including text, voice, news feeds, tweets & stories, user behavior, and real-time data. The team has several US professors as senior advisors with world-class expertise in machine learning, statistics, optimization, and stochastic control, who provide advice and mentorship for all members of the distributed team and bring research and development experience in real-world NLU/NLP, conversational AI, chatbots, dialog management, search, and voice/dialog systems. Your contributions will drive content discovery and personalization through voice/video/search/summary interactions across mobile apps, third-party devices (e.g. Alexa, Google Home, Roku), and automotive products.
Why Join Us:
We are a fully distributed team with a New York HQ. Flexible work with a flexible schedule is possible
You get to work on turning bleeding-edge research on generating voice, video, and deepfakes into commercial products
As part of the Search and Voice Science team, you will design and build the next generation of voice/text/video and search experiences. As a Scientist, you will be an expert in areas spanning speech recognition, natural language processing and understanding, dialog management, personalization, natural language generation, and information retrieval
Open to international candidates. No-micromanagement environment for highly self-sufficient individuals
Powerful workstation with GPUs
Responsibilities and what we are looking for:
Develop cutting-edge ML for Speech to Text (STT), Text to Speech (TTS), Speech to Speech (STS), and real-time speech synthesis (voice cloning, deepfakes). Harness powerful real-time speech synthesis and AI discrimination in voice. Convert written text to natural-sounding speech using the latest neural speech synthesis techniques. Design criteria for voice performance evaluation
Research, design, experiment with and build ML systems, particularly related to voice and search products
Prototype new features. This means rapidly building prototypes end-to-end, including storage, business logic, and user experience.
R&D in voice recognition. Synthesize a variety of voices across all supported languages and dialects. Adapt and customize voices for vocabulary and tone, including for medical conditions such as muscle tension dysphonia
Assemble prototypes and MVPs. Compress models and optimize inference. Define measures of success for podcast-related initiatives. Build dashboards and self-service tools to enable ongoing monitoring of trends. Develop a deep appreciation of the podcast content landscape and how users engage with podcasts, rooms, and videos
Initial work could be done remotely with daily Zoom standups. Preferably, you would be based at and work from our New York, NY office
Advanced STEM degree: M.S. or Ph.D. (Computer Science, Math, Statistics, Economics, Engineering, or related field) with extensive relevant AI experience
Extensive experience using ML/AI methodologies, building data pipelines, exploratory data analysis, and other aspects of the data science process. Comfortable navigating large datasets (advanced SQL). Work with the product team on all aspects of product experimentation. Identify good testing opportunities, set up A/B tests, and measure their results.
Experience with ML libraries and frameworks (TensorFlow, Keras, PyTorch, CUDA, TensorFlow Serving, Vowpal Wabbit, scikit-learn)
Familiarity with tools such as Python, R, Julia, or MATLAB. Familiarity with AWS or another cloud infrastructure provider (GCP, Azure, etc.). Technologies: Kafka, Airflow, Composer. Production experience implementing machine learning pipelines and models at scale in Python, Java, Scala, or similar languages. Proficiency with distributed processing and warehousing frameworks (e.g., Spark, Hadoop, Hive, Tez). Experience with the research and development workflow/life-cycle for large-scale batch and streaming ML
Excellent written and verbal communication skills; ability to collaborate effectively with non-technical team members and stakeholders. Self-motivated, growth-oriented, and driven to pursue solutions to challenging problems. Excellent problem-solving skills
A big plus: deeply curious; interested in how people interact with content, and podcasts specifically. Though not required, previous experience in media, entertainment, or technology is a plus. You can be located anywhere. Spoken and written English (B2+) is a must
Our Tech Stack:
PyTorch and TensorFlow wrapped in Flask and running in a Kubernetes cluster
Flutter, Node.js with TypeScript running on Firebase Functions and Google Cloud Storage
Great Libraries and Frameworks: NLTK, PyBrain, Caffe, NumPy, SciPy, Pandas, Matplotlib, Keras