Martin Pavlovski

I’m a machine learning researcher and engineer with a strong background in scalable machine learning, deep learning, and big data mining. My work focuses on developing novel ML/DL models with applications spanning digital media and online services (specifically, computational advertising and web-scale knowledge graphs), power systems, autonomous navigation, and biomedical informatics.

Experience

Senior Machine Learning Engineer

Data Science Team

February 2025 – Present

Samsung Electronics America

Mountain View, CA

Supervisor: Mi Zhang, Director of Machine Learning & Data Science

Research Scientist

Knowledge Graph Science Team

July 2023 – February 2025

Yahoo Research

Mountain View, CA

Developed state-of-the-art deep learning approaches to entity matching and entity reconciliation for the Yahoo Knowledge Graph.

Supervisor: Nicolas Torzec, Director of Research Engineering

Research Scientist

Ad Targeting Team

June 2021 – July 2023

Yahoo Research

Mountain View, CA

Designed, developed and deployed low-latency algorithms for extreme multi-label classification (XMLC), enabling large-scale interest-based / conversion-based audience targeting in a real-time setting.

Supervisors:
• Jimmy Yang, Senior Director of Research
• Yifan Hu, Senior Director of Research
• Narayan Bhamidipati, Senior Director of Research

Research Assistant

Department of Computer & Information Sciences

August 2020 – May 2021

Temple University

Philadelphia, PA

Developed cascades of convolutional neural networks and applied them to categorizing abnormal (anomalous) events in power systems based on synchrophasor measurements.

Supervisor: Zoran Obradovic, Laura H. Carnell Professor of Data Analytics at Temple University

Intern Scientist

Targeting, Insights and Measurement Team

June 2020 – August 2020

Yahoo Research, Verizon Media

Remote (Philadelphia, PA)

Developed a multi-scale graph embedding approach for extreme multi-label classification (XMLC), aimed at selecting relevant items from a large number of possible outputs, while automatically categorizing the outputs into hierarchically nested groups. Apart from demonstrating superior performance compared to other factorization machine-based models on public benchmark datasets, the approach was also leveraged for joint conversion prediction across hundreds of predictive audiences.

Supervisor: Narayan Bhamidipati, Senior Director of Research

Selected Publications

Conference

Extreme Multi-Label Classification for Ad Targeting using Factorization Machines

Pavlovski, M., Ravindran, S., Gligorijevic, Dj., Agrawal, S., Stojkovic, I., Segura-Nunez, N., Gligorijevic, J.

Proc. 29th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2023)

Computational advertising, Web services

Journal

Hierarchical Convolutional Neural Networks for Event Classification on PMU Measurements

Pavlovski, M., Alqudah, M., Dokic, T., Hai, A. A., Kezunovic, M., Obradovic, Z.

IEEE Transactions on Instrumentation and Measurement

Power systems

Conference

Time-Aware User Embeddings as a Service

Pavlovski, M., Gligorijevic, J., Stojkovic, I., Agrawal, S., Komirishetty, S., Gligorijevic, Dj., Bhamidipati, N., Obradovic, Z.

Proc. 26th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2020)

Computational advertising, Web services

Conference

Generalization-Aware Structured Regression towards Balancing Bias and Variance

Pavlovski, M., Zhou, F., Arsov, N., Kocarev, L., Obradovic, Z.

Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI)

Core ML

Journal

Generating Highly Accurate Prediction Hypotheses Through Collaborative Ensemble Learning

Arsov, N.*, Pavlovski, M.*, Basnarkov, L., Kocarev, L.

* Authors contributed equally

Scientific Reports, Nature Publishing Group

Biomedical informatics, Core ML

Projects

Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART)

September 2019 – May 2021

Funding:
U.S. Department of Energy (DOE)

The objective of this project was to utilize Big Data Analytics (BDA) to automate the monitoring of power systems from synchrophasor recordings. Such automation can improve the assessment of disturbance events that may affect power system resilience. As one of the project’s central aims, the effects of different expert-assisted scenarios for improved labeling of event data on event classification accuracy were thoroughly analyzed. To that end, a large-scale comparative analysis was conducted by assessing various traditional as well as more sophisticated event classification models on a dataset involving two years of synchrophasor measurements taken at various locations across one major interconnection of the U.S. power grid.

The experimental findings on rapidly refined, partially inspected, and fully inspected event labels provided evidence that convolutional neural networks (CNNs) outperform traditional models, regardless of the quality of the available event labels. When using such models, the event classification performance improved as more PMU signals were inspected by a domain expert. Using smaller fractions of fully inspected signals on their own typically yielded higher accuracy than using them in addition to rapidly refined signals. Finally, it was observed that performance similar to that obtained using entirely domain-driven labeling may be achieved as long as the expert is experienced enough not to mislabel more than ~5% of the event data.

UAS Detection and Counter-UAS Research and Development

September 2019 – April 2020

Funding:
U.S. Air Force Research Laboratory (AFRL)

This project had two objectives related to the operation of drone swarms in GPS-denied environments: (1) the development of a structured model of environmental deviance to aid in autonomous navigation, and (2) the integration of such a model into a collision avoidance system. Both objectives were achieved, and the outcomes were tested in a simulated environment that mimics a GPS-denied scenario. Using data from hundreds of simulated swarm flights, the findings indicated that structured learning can improve navigational accuracy without the need for externally provided position feedback.

Disease Detection and Disease Progression Modeling

July 2018 – June 2019

Funding:
IQVIA

The objective of this project was to determine whether diagnostics of Alzheimer’s disease (AD) from EMR data alone (without relying on diagnostic imaging) could be significantly improved by applying clinical domain knowledge in data preprocessing and positive patient cohort selection rather than setting naive filters. Data were extracted from a repository of heterogeneous ambulatory EMR data, collected from primary care medical offices across the United States. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a recurrent deep neural network (RNN) model to predict whether a given patient may develop AD. The RNN model that used data relevant to AD performed significantly better when learning from the SCRP dataset than when datasets were selected naively. The integration of qualitative medical knowledge for dataset selection with deep learning techniques provided a mechanism for significant improvement of AD prediction.

Clinical Decision Support System (CDSS) for Multiple Choice Ranking in Cancer Comorbidity

April 2018 – June 2019

Funding:
King Abdullah University of Science and Technology’s Center Partnership Fund Program

The goal of this project was to identify comorbidities and genes associated with colorectal cancer (CRC) – the third most common cancer in the United States and the second leading cause of cancer death. The comorbidities of CRC were studied by designing a novel comorbidity network model based on the State Inpatient Database (SID) for the state of California, whose records were collected under the Healthcare Cost and Utilization Project (HCUP). Ranked lists of comorbidities and comorbidity networks were created, and the prevalence of comorbidities in different stages of CRC was determined. The comorbidity lists were utilized for text mining of PubMed and DisGeNET in order to extract genes associated with CRC. The results of the comorbidity network analyses indicated which comorbidities of CRC are highly expected. The discovered genes could be used to recruit more individuals who would benefit from genetic consultations. The identified associations between the comorbidities, CRC, and shared genes can have important implications for the early discovery and prognosis of CRC.

Big Data Analytics for Clinical Trials Optimization

July 2017 – June 2018

Funding:
QuintilesIMS

Clinical trials optimization was facilitated by developing DeepMatch (DM), a novel approach based on the recent advances in deep learning. DM was designed to learn from both investigator and trial-related heterogeneous data sources and rank investigators based on their expected enrollment performance on new clinical trials.

A large-scale evaluation was conducted on 2618 studies, in which the proposed ranking-based framework improved on the current state of the art by up to 19% on ranking investigators and up to 10% on detecting top/bottom performers when recruiting investigators for new clinical trials. These findings indicated that DM can provide substantial improvement over current industry standards in several regards: (1) the enrollment potential of the investigator list, (2) the time it takes to generate the list, and (3) data-informed decisions about new investigators.

Talks

Landing Jobs Post-Graduation in CST: An Alumni Panel

March 2023
Remote

Invited talk at the Landing Jobs Post-Graduation in CST: An Alumni Panel organized by the Graduate Student Organization (GSO) for the College of Science and Technology (CST), Temple University, March 2023. [Virtual (remote) talk]

Northeast Student Data Corps: Data Science Career Panel

April 2021
Remote

Invited talk at the Northeast Student Data Corps: Data Science Career Panel organized by the Northeast Big Data Innovation Hub, April 2021. [Virtual (remote) talk | video recording available online]

Time-Aware Representation Learning

June 2020
Remote

Invited talk as part of the Yahoo Research “Faculty Research and Engagement Program (FREP)” Talk Series, June 2020. [Virtual (remote) talk]

Spatially-Aware Mixture Models for Lightning-Induced Outages

October 2019
Texas A&M University, College Station, TX, USA

Lecture at Texas A&M University, College Station, TX, USA, October 2019.

Generalization-Aware Structured Regression towards Balancing Bias and Variance

August 2018
Belgrade, Serbia

New Voices talk at the NSF US-Serbia and West Balkan Data Science Workshop, Belgrade, Serbia, August 2018.

Awards & Achievements

2020 CST Outstanding Research Assistant Award

October 2020
Awarded by: College of Science and Technology (CST) at Temple University

This award is given annually to only one graduate student from each CST department (Biology, Chemistry, Computer and Information Science, Mathematics, and Physics) who, during their research assistantship, accomplished the following:

• demonstrated excellence in research characterized by the significance of the problem that was addressed, the novelty of the approach used, and the rigor of the research;
• conducted research that represents a positive reflection of the department, the College, and the entire Temple community.

Best Student Paper Award at the 16th Int’l Conf. Artificial Intelligence Applications and Innovations (AIAI)

June 2020
Awarded by: AIAI 2020 Organizing Committee

This award is given to the authors of papers that are selected as best among the papers published at AIAI on the basis of novelty, innovation, technical excellence, and potential impact in the field.

In the case of AIAI 2020, among 97 accepted papers, our paper “Autonomous Navigation for Drone Swarms in GPS-Denied Environments Using Structured Learning” (written by Power, W., Pavlovski, M., Saranovic, D., Stojkovic, I., & Obradovic, Z.) was selected as the best student paper, and each of the authors, including myself, was a recipient of the award.

Early Career Research Award

August 2018
Awarded by: NSF US-Serbia and West Balkan Data Science Workshop Committee

Awarded to young researchers showing excellence at the onset of their research careers on the basis of presenting their research in the workshop. In particular, I was selected as the recipient of this award based on my research talk and poster presentation related to my publication [R1].

[R1] Pavlovski, M., Zhou, F., Arsov, N., Kocarev, L., & Obradovic, Z. (2018). Generalization-Aware Structured Regression towards Balancing Bias and Variance. In Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI) (pp. 2616-2622).

Best Diploma Thesis Award

March 2016
Awarded by: Macedonian Society for ETAI (Electronics, Telecommunications, Automation and Informatics)

Following the graduation ceremony, I received this award as the graduate whose diploma thesis¹ was selected as the best among the theses of all students from the same generation of graduates from the Faculty of Computer Science and Engineering, under the Ss. Cyril and Methodius University, Skopje, Macedonia. The thesis selection criteria included technical and research excellence reaching beyond the expected diploma thesis requirements, as well as potential impact and significance in the field of Computer Science.

It is also worth noting that my diploma thesis had a further-reaching impact in the field, as its contents were later published in a peer-reviewed paper [R2] in Scientific Reports, a Nature Publishing Group journal (impact factor at the time of publication: 4.122; current impact factor: 4.997; 5-year impact factor: 5.516).

[R2] Arsov, N.*, Pavlovski, M.*, Basnarkov, L., & Kocarev, L. (2017). Generating Highly Accurate Prediction Hypotheses Through Collaborative Ensemble Learning. Scientific Reports, 7(1), 1-9.
__________________________________________________________
¹ Often used interchangeably with bachelor’s thesis or graduation thesis.
* Authors contributed equally.

Best Student Award

March 2015, 2016
Awarded by: Faculty of Computer Science and Engineering, Skopje

This award is administered by the Faculty of Computer Science and Engineering, under the Ss. Cyril and Methodius University, Skopje, Macedonia. It is granted to outstanding undergraduate students having a GPA above 9.5 out of 10.00 (on a 5.00 – 10.00 scale, no curve).

Education

Jan 2018 – May 2021

Temple University, Philadelphia, PA

Degree: Ph.D. in Computer and Information Science

Dissertation: “Learning from Structured Data: Scalability, Stability and Temporal Awareness”

Advisor: Zoran Obradovic

Sep 2011 – Nov 2015

Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, Macedonia

Degree: B.Eng. in Electrical Engineering and Information Technologies

Major: Informatics and Computer Engineering

Thesis: “Bi-level Interactive Ensemble Classifier based on Cumulative Collaboration”

Advisors: Lasko Basnarkov & Ljupco Kocarev

Sep 2007 – Jun 2011

High School “Josip Broz – Tito”, Skopje, Macedonia

Specialized in mathematics
