Martin Pavlovski

I’m a hands-on research scientist with a strong research and engineering background in scalable machine learning, deep learning, and big data mining. My work focuses primarily on developing novel ML/DL models with broad applications, ranging from digital media and online services (encompassing knowledge graphs and computational advertising) to power systems, autonomous navigation, and biomedical informatics.

Experience

Research Scientist

Knowledge Graph Science Team

Yahoo Research

Mountain View, CA
Developing state-of-the-art deep learning approaches to entity matching and entity reconciliation for the Yahoo Knowledge Graph.
Supervisor: Nicolas Torzec, Director of Research Engineering

Research Scientist

Ad Targeting Team

Yahoo Research

Mountain View, CA
Designed, developed and deployed low-latency algorithms for extreme multi-label classification (XMLC), enabling large-scale interest-based / conversion-based audience targeting in a real-time setting.
Supervisors:
• Jimmy Yang, Senior Director of Research
• Yifan Hu, Senior Director of Research
• Narayan Bhamidipati, Senior Director of Research

Research Assistant

Department of Computer & Information Sciences

Temple University

Philadelphia, PA
Developed cascades of convolutional neural networks and applied them to categorizing anomalous events in power systems based on synchrophasor measurements.
Supervisor: Zoran Obradovic, Laura H. Carnell Professor of Data Analytics at Temple University

Intern Scientist

Targeting, Insights and Measurement Team

Yahoo Research, Verizon Media

Remote (Philadelphia, PA)
Developed a multi-scale graph embedding approach for extreme multi-label classification (XMLC), aimed at selecting relevant items from a large number of possible outputs, while automatically categorizing the outputs into hierarchically nested groups. Apart from demonstrating superior performance compared to other factorization machine-based models on public benchmark datasets, the approach was also leveraged for joint conversion prediction across hundreds of predictive audiences.
Supervisor: Narayan Bhamidipati, Senior Director of Research

Research Assistant

Department of Computer & Information Sciences

Temple University

Philadelphia, PA
Worked on (1) spatiotemporal graph modeling for autonomous navigation of drone swarms in GPS-denied environments, as part of a project with the U.S. Air Force Research Laboratory (AFRL); and (2) Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART), a project funded by the U.S. Department of Energy (DOE).
Supervisor: Zoran Obradovic, Laura H. Carnell Professor of Data Analytics at Temple University

Selected Publications

Pavlovski, M., Ravindran, S., Gligorijevic, Dj., Agrawal, S., Stojkovic, I., Segura-Nunez, N., Gligorijevic, J.

Proc. 29th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2023)

Pavlovski, M., Alqudah, M., Dokic, T., Hai, A. A., Kezunovic, M., Obradovic, Z.

IEEE Transactions on Instrumentation and Measurement

Pavlovski, M., Gligorijevic, J., Stojkovic, I., Agrawal, S., Komirishetty, S., Gligorijevic, Dj., Bhamidipati, N., Obradovic, Z.

Proc. 26th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2020)

Pavlovski, M., Zhou, F., Arsov, N., Kocarev, L., Obradovic, Z.

Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI)

Arsov, N.*, Pavlovski, M.*, Basnarkov, L., Kocarev, L.

* Authors contributed equally

Scientific Reports, Nature Publishing Group

Projects

Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART)

Funding:
U.S. Department of Energy (DOE)

The objective of this project was to utilize Big Data Analytics (BDA) to automate the monitoring of power systems from synchrophasor recordings. Such automation can improve the assessment of disturbance events that may affect power system resilience. As one of the project’s central aims, the effects of different expert-assisted scenarios for improved labeling of event data on event classification accuracy were thoroughly analyzed. To that end, a large-scale comparative analysis was conducted by assessing various traditional as well as more sophisticated event classification models on a dataset involving two years of synchrophasor measurements taken at various locations across one major interconnection of the U.S. power grid.

The experimental findings on rapidly refined, partially and fully inspected event labels provided evidence that convolutional neural networks (CNNs) outperform traditional models, regardless of the quality of the available event labels. When using such models, the event classification performance improved as more PMU signals were inspected by a domain expert. Smaller fractions of fully inspected signals typically yielded higher accuracy than using them in addition to rapidly refined signals. Finally, it was observed that performance similar to the one obtained using entirely domain-driven labeling may be achieved as long as the expert is experienced enough not to mislabel more than ~5% of the event data.

UAS Detection and Counter-UAS Research and Development

Funding:
U.S. Air Force Research Laboratory (AFRL)

This project had two objectives related to the operation of drone swarms in GPS-denied environments: (1) the development of a structured model of environmental deviance to aid in autonomous navigation, and (2) the integration of such a model into a collision avoidance system. Both objectives were achieved, and the outcomes were tested in a simulated environment that mimics a GPS-denied scenario. Using data from hundreds of simulated swarm flights, the findings indicated that structured learning can improve navigational accuracy without the need for externally provided position feedback.

Disease Detection and Disease Progression Modeling

Funding:
IQVIA

The objective of this project was to determine whether diagnostics of Alzheimer’s disease (AD) from EMR data alone (without relying on diagnostic imaging) could be significantly improved by applying clinical domain knowledge in data preprocessing and positive patient cohort selection rather than setting naive filters. Data were extracted from a repository of heterogeneous ambulatory EMR data, collected from primary care medical offices all over the United States. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a recurrent deep neural network (RNN) model to predict whether a given patient may develop AD. The RNN model that used data relevant to AD performed significantly better when learning from the SCRP datasets than when learning from naively selected datasets. The integration of qualitative medical knowledge for dataset selection with deep learning techniques provided a mechanism for significant improvement of AD prediction.

Clinical Decision Support System (CDSS) for Multiple Choice Ranking in Cancer Comorbidity

Funding:
King Abdullah University of Science and Technology’s Center Partnership Fund Program

The goal of this project was to identify comorbidities and genes associated with colorectal cancer (CRC) – the third most common cancer in the United States and the second leading cause of cancer death. The comorbidities of CRC were studied by designing a novel comorbidity network model based on the State Inpatient Database (SID) for the state of California, the records in which were collected under the Healthcare Cost and Utilization Project (HCUP). Ranked lists of comorbidities and comorbidity networks were created, and the prevalence of comorbidities in different stages of CRC was determined. The comorbidity lists were utilized for text mining of PubMed and DisGeNET in order to extract genes associated with CRC. The results of the comorbidity network analyses indicated which comorbidities of CRC are highly expected. The discovered genes could be used to recruit more individuals who would benefit from genetic consultations. The identified associations between the comorbidities, CRC, and shared genes can have important implications for the early discovery and prognosis of CRC.

Big Data Analytics for Clinical Trials Optimization

Funding:
QuintilesIMS

Clinical trials optimization was facilitated by developing DeepMatch (DM), a novel approach based on the recent advances in deep learning. DM was designed to learn from both investigator and trial-related heterogeneous data sources and rank investigators based on their expected enrollment performance on new clinical trials.

A large-scale evaluation was conducted on 2618 studies in which the proposed ranking-based framework improved the current state-of-the-art by up to 19% on ranking investigators and up to 10% on detecting top/bottom performers when recruiting investigators for new clinical trials. These findings indicated that DM can provide substantial improvement over current industry standards in several regards: (1) the enrollment potential of the investigator list, (2) the time it takes to generate the list, and (3) data-informed decisions about new investigators.

Talks

Landing Jobs Post-Graduation in CST: An Alumni Panel

Invited talk at the Landing Jobs Post-Graduation in CST: An Alumni Panel organized by the Graduate Student Organization (GSO) for the College of Science and Technology (CST), Temple University, March 2023. [Virtual (remote) talk]

Northeast Student Data Corps: Data Science Career Panel

Invited talk at the Northeast Student Data Corps: Data Science Career Panel organized by the Northeast Big Data Innovation Hub, April 2021. [Virtual (remote) talk | video recording available online]

Time-Aware Representation Learning

Invited talk as part of the Yahoo Research “Faculty Research and Engagement Program (FREP)” Talk Series, June 2020. [Virtual (remote) talk]

Spatially-Aware Mixture Models for Lightning-Induced Outages

Lecture at Texas A&M University, College Station, TX, USA, October 2019.

Generalization-Aware Structured Regression towards Balancing Bias and Variance

New Voices talk at the NSF US-Serbia and West Balkan Data Science Workshop, Belgrade, Serbia, August 2018.

Awards & Achievements

2020 CST Outstanding Research Assistant Award

This award is given annually to only one graduate student from each CST department (Biology, Chemistry, Computer and Information Science, Mathematics, and Physics) who, during their research assistantship, accomplished the following:

  1. demonstrated excellence in research characterized by the significance of the problem that was addressed, the novelty of the approach used, and the rigor of the research;
  2. conducted research that represents a positive reflection of the department, the College, and the entire Temple community.

Best Student Paper Award at the 16th Int’l Conf. Artificial Intelligence Applications and Innovations (AIAI)

This award is given to the authors of papers that are selected as best among the papers published at AIAI on the basis of novelty, innovation, technical excellence, and potential impact in the field.

In the case of AIAI 2020, among 97 accepted papers, our paper “Autonomous Navigation for Drone Swarms in GPS-Denied Environments Using Structured Learning” (written by Power, W., Pavlovski, M., Saranovic, D., Stojkovic, I., & Obradovic, Z.) was selected as the best student paper, and each of the authors, including myself, was a recipient of the award.

Early Career Research Award

Awarded to young researchers showing excellence at the onset of their research careers, on the basis of the research they present at the workshop. In particular, I was selected as the recipient of this award based on my research talk and poster presentation related to my publication [R1].

[R1] Pavlovski, M., Zhou, F., Arsov, N., Kocarev, L., & Obradovic, Z. (2018). Generalization-Aware Structured Regression towards Balancing Bias and Variance. In Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI) (pp. 2616-2622).

Best Diploma Thesis Award

Following the graduation ceremony, I received this award as the graduate whose diploma thesis¹ was selected as the best among the theses of all students from the same generation of graduates from the Faculty of Computer Science and Engineering, under the Ss. Cyril and Methodius University, Skopje, Macedonia. The thesis selection criteria included technical and research excellence reaching beyond the expected diploma thesis requirements, as well as potential impact and significance in the field of Computer Science.

It is also worth noting that my diploma thesis had a further-reaching impact in the field, as its contents were later published in a peer-reviewed paper [R2] in Scientific Reports, a Nature Publishing Group journal (impact factor at the time of publication: 4.122; current impact factor: 4.997; 5-year impact factor: 5.516).

[R2] Arsov, N.*, Pavlovski, M.*, Basnarkov, L., & Kocarev, L. (2017). Generating highly accurate prediction hypotheses through collaborative ensemble learning. Scientific Reports, 7(1), 1-9.
__________________________________________________________
¹ Often used interchangeably with bachelor’s thesis or graduation thesis.
* Authors contributed equally.

Best Student Award

This award is administered by the Faculty of Computer Science and Engineering, under the Ss. Cyril and Methodius University, Skopje, Macedonia. It is granted to outstanding undergraduate students having a GPA above 9.5 out of 10.00 (on a 5.00 – 10.00 scale, no curve).

Education

Jan 2018 – May 2021

Temple University, Philadelphia, PA

Degree: Ph.D. in Computer and Information Science

Dissertation: “Learning from Structured Data: Scalability, Stability and Temporal Awareness”

Advisor: Zoran Obradovic

Sep 2011 – Nov 2015

Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, Macedonia

Degree: B.Eng. in Electrical Engineering and Information Technologies

Major: Informatics and Computer Engineering

Thesis: “Bi-level Interactive Ensemble Classifier based on Cumulative Collaboration”

Advisors: Lasko Basnarkov & Ljupco Kocarev

Sep 2007 – Jun 2011

High School “Josip Broz – Tito”, Skopje, Macedonia

Specialized in mathematics