Ozgur Ozturk

Senior Data Scientist @ Nasdaq (eVestment) and adjunct instructor in the MS in Data Science program @ Southern Methodist University. Uses Python, scikit-learn, and Spark. Worked as a data scientist at Philips Wellcentive and CareerBuilder, and as a full-stack developer at Wolters Kluwer, AirWatch VMware, NCR, General Dynamics, Georgia Tech, and Oracle. Taught programming at Gwinnett Tech. Received his PhD in CSE from the Ohio State University. Interests include Big Data, Cloud Computing, Machine Learning, and Lean/Agile methodologies. Together with Kulsoom Abdullah, he is the cofounder of the Data Science Pros channel: https://www.youtube.com/channel/UCJ1u-63nhgzstwgDmyocWPw?sub_confirmation=1 and the Data Science Pros meetup: https://www.meetup.com/DataSciencePros/


Document Classification Workshop (NLP, Text Processing, Machine Learning) [Python, scikit-learn]

You can forklift this project into your MatrixDS account if you would like to follow along: https://community.platform.matrixds.com/community/project/5d5d5a3cd3af39f8147b8100/dashboard

We will learn NLP techniques for processing a text corpus in preparation for machine learning applications, including:

  • stop word removal

  • tokenization

  • lemmatizing

  • n-grams

  • count vectorizer and TF-IDF

  • applying supervised learning (random forests) to vector representations of documents for classification

  • hyperparameter tuning

  • model evaluation
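
The steps listed above can be sketched end to end with scikit-learn. This is a minimal illustration, not the workshop's notebook code: the toy corpus, the labels, and the parameter grid are invented for the example. Lemmatization is left out because scikit-learn has no built-in lemmatizer; one from a library such as NLTK or spaCy would typically be plugged in through the vectorizer's `tokenizer` parameter.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Toy corpus invented for illustration (not from the workshop material).
docs = [
    "the team won the football match last night",
    "the striker scored two goals in the final",
    "the coach praised the players after the game",
    "fans cheered as the team lifted the trophy",
    "the goalkeeper saved a penalty in the match",
    "the league season starts with a home game",
    "the players trained hard before the tournament",
    "a late goal decided the championship final",
    "the new laptop has a faster processor and more memory",
    "the software update fixes several security bugs",
    "developers released a new version of the database",
    "the phone ships with a better camera and battery",
    "the cloud service scales servers automatically",
    "the programming language adds new syntax features",
    "engineers deployed the application to the cluster",
    "the operating system update improves laptop performance",
]
labels = ["sports"] * 8 + ["tech"] * 8

# Hold out a stratified test set for model evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    docs, labels, test_size=0.25, stratify=labels, random_state=0
)

# TfidfVectorizer tokenizes, lowercases, removes English stop words,
# builds n-grams, and applies TF-IDF weighting in one step.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english", lowercase=True)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Small, illustrative grid: unigrams vs. unigrams plus bigrams,
# and two forest sizes.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__n_estimators": [25, 50],
}

# Hyperparameter tuning with 3-fold cross-validation on the training set.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Model evaluation on the held-out documents.
predictions = search.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("Best parameters:", search.best_params_)
print("Test accuracy:", accuracy)
print(classification_report(y_test, predictions, zero_division=0))
```

Wrapping the vectorizer and the classifier in one `Pipeline` means the grid search refits the TF-IDF vocabulary inside each cross-validation fold, so the validation documents do not leak into the features. On a corpus this small the scores are not meaningful; the sketch only shows how the pieces connect.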

Based on the NLP Jupyter notebooks in repo 1 and repo 2.

IT Topics
2:30 PM