Black Duck Software Canada: Advanced Technology Research
Open Source Software and Services are changing the way commercial software is built and delivered. Black Duck Research is at the forefront of applied research that helps enterprises consume Open Source driven solutions in a more compliant, secure and reliable way. In that context, we conduct state-of-the-art applied research in Data Mining, Machine Learning, Natural Language Processing, Software Engineering and other related areas. Our research team consists of award-winning computer scientists, innovators, and Ph.D. and Master’s degree students, advised by software industry veterans and faculty members from premier institutes and universities in the US and Canada. We have strong research collaborations with academic institutes through internship and co-op programs.
Black Duck Research Vision
Open Source related data are constantly evolving. These evolving data pose several challenges for Open Source governance and adoption in a compliant and secure way. Many of these challenges stem from the fact that Open Source projects entail large volumes of structured and unstructured data that are difficult to find, manage and analyze. We are applying various Data Mining, Machine Learning and Natural Language Processing solutions to solve some of the most challenging problems related to open source security.
Black Duck Research Projects
Web Services Governance
Data-related web services such as APIs and microservices are important for enterprises to achieve their business goals. Users of data services (enterprises or individuals) must ensure that they are compliant with the terms of service (ToS) that govern the usage of those web services. ToS may be revised frequently, requiring re-evaluation of the corresponding data services from legal compliance, security and privacy perspectives. Unfortunately, tracking changes in ToS is difficult and time-consuming, especially when hundreds or thousands of data services and microservices are in use. Finding changes in ToS with legal and security implications is an even more difficult task. Black Duck Research develops solutions to manage the legal and security risks that come with the usage of web services.
Open Source Software Data Management
Black Duck Software constantly curates massive amounts of open source software and services-related data, such as source code, vulnerabilities and licenses, to provide our customers and partners with insights that enable them to consume open source in a more compliant and secure way. It is challenging to develop flexible approaches to maintain, query, browse and organize this information, which contains both unstructured and structured data. Black Duck Research is developing database design principles and computational techniques for managing open source related data efficiently.
Security Data Management and Analytics
Open Source security is in many ways a data management problem. Developers find it difficult to choose open source components that are secure and free of publicly known vulnerabilities. An Artificial Intelligence driven security solution, trained on real-world datasets, is the next frontier of Open Source security. At Black Duck Research, we have the world’s largest database of open source software to move us closer in that direction. In this context, we track publicly known vulnerabilities, licenses, vendor information, and many other pieces of information to train machine learning models and build the next generation of Open Source security solutions.
Open Source Semantic Search
In the constantly expanding world of open source software and services, developers find it increasingly difficult to choose open source that is compliant, secure and reliable. Millions of open source software projects and services are publicly available today, and compliance, security and quality related information is extremely difficult for developers to find, making mindful selection of open source an onerous process. Black Duck Research is building a semantic search engine that allows users to describe their requirements in natural language and receive results that meet quality, legal and security requirements.
Black Duck Research Leadership & Collaborators
Research Team Leads:
Zhensong Qian is a data scientist at Black Duck Research. He is working on Open Source Software (OSS) graph modelling and visualization related problems. He earned his PhD from the School of Computing Science at Simon Fraser University, Canada. He has won several research awards and has gained more than 7 years of research expertise in the fields of machine learning, big data analytics and database management. He has published numerous research papers at various top international conferences and journals. Leveraging state-of-the-art machine learning and data science techniques, he enjoys analyzing and modeling complex, multi-dimensional data to solve real-world problems. During this process he has developed novel solutions to data intensive industry applications.
- Oliver Schulte, Zhensong Qian, FACTORBASE: Multi-Relational Structure Learning with SQL All The Way, Invited Contribution, International Journal of Data Science and Analytics, 2017. (Under Review).
- Oliver Schulte, Zhensong Qian, Arthur E. Kirkpatrick, Xiaoqian Yin, Yan Sun, Fast Learning of Relational Dependency Networks. Invited Contribution. Machine Learning Journal, 2016.
- Zhensong Qian, Oliver Schulte. FACTORBASE: SQL for Multi-Relational Model Learning. NIPS-Learning System Workshop, Montreal, Canada, 2015.
- Zhensong Qian, Oliver Schulte. FACTORBASE: Multi-Relational Structure Learning with SQL All The Way. DSAA 2015 IEEE International Conference on Data Science and Advanced Analytics. Paris, France, 2015.
- Oliver Schulte, Zhensong Qian. SQL for SRL: Structure Learning Inside a Database System. UAI-15, Amsterdam, The Netherlands, 2015.
- Zhensong Qian, Oliver Schulte and Yan Sun. Computing Relational Sufficient Statistics for Large Databases. CIKM 2014, Shanghai, China, 2014.
- Oliver Schulte, Zhensong Qian, Arthur Kirkpatrick, Xiaoqian Yin and Yan Sun. Fast Learning of Relational Dependency Networks. ILP 2014, Nancy, France, 2014.
- Zhensong Qian, Oliver Schulte. Learning Bayes Nets for Relational Data With Link Uncertainty. IJCAI-13, Beijing, China. 2013.
- Fatemeh Riahi, Oliver Schulte, Zhensong Qian, Qing Li. Identifying Important Individuals in Relational Data. AAAI-13, Bellevue, WA, USA. 2013.
- Bahareh Bina, Oliver Schulte, Branden Crawford, Zhensong Qian, Yi Xiong. Simple Decision Forests for Multi-Relational Classification, Decision Support Systems, 2013.
- Zhensong Qian, Chenghui Zhang. Prediction Model for Harmful Algal Blooms Using Improved Wavelet Networks. The Second Open Science Meeting on HABs and Eutrophication of GEOHAB, Beijing, China, 2009.
- Chenghui Zhang, Zhensong Qian, Wenxing Sun, Peng Ji, Jing Hu. LMBP Neural Network Combination Forecast Model for Red Tide Based on IOWA Operators, Journal of Tianjin University, 2009. (In Chinese.)
Sardar Ali is a Data Scientist at Black Duck Research. He leads the web services data management team that is focused on building technological solutions for governance of web services in commercial applications. Previously, Sardar was with SAP where he helped in automating the web services legal compliance processes. Sardar received his PhD in Computer Science from the University of Victoria, Canada.
- Dimitri Marinakis, Kui Wu, Sardar Ali, and Kyle Weston, "Systems and methods for meter placement in power grid networks," US patent, WO/2014/185921, Nov. 20, 2014.
- Sardar Ali, Kui Wu, Kyle Weston, and Dimitri Marinakis, "A Machine Learning Approach to Meter Placement for Power Quality Estimation in Smart Grid," IEEE Transactions on Smart Grid (TSG), vol. 7, no. 3, pp. 1552-1561, May 2016.
- Sardar Ali, Irfan Ul Haq, Sajjad Rizvi, Naurin Resheed, Unum Sarfraz, Syed Ali Khayam, and Fauzan Mirza, "On Mitigating Sampling-Induced Accuracy Loss in Traffic Anomaly Detection Systems," ACM SIGCOMM Computer Communication Review (CCR), vol. 40, no. 3, pp. 4-16, July 2010.
- Sardar Ali, Kyle Weston, Dimitri Marinakis, and Kui Wu, “Intelligent Meter Placement for Power Quality Estimation in Smart Grid,” The 4th IEEE Int'l Conf. on Smart Grid Communications (SmartGridComm), 2013.
- Sardar Ali, Kui Wu, and Dimitri Marinakis, “A maximum-Entropy Based Fast Estimation of Power Quality for Smart Microgrid,” The 4th IEEE Int'l Conf. on Smart Grid Communications (SmartGridComm), 2013.
- Irfan Ul Haq, Sardar Ali, Hasan Khan, and Syed Ali Khayam, "What is the Impact of P2P Traffic on Anomaly Detection?," International Symposium on Research in Attacks, Intrusions and Defenses (RAID), 2010.
Cheng Chen is a Data Scientist at Black Duck Research. He is leading the open source data management team. He received the B.Sc. degree in Computer Science from Beijing University of Posts and Telecommunications, China in 2010, and M.Sc. and Ph.D. degrees in Computer Science from the University of Victoria, Canada in 2012 and 2016, respectively. His research interests are online social networks, recommender systems, and distributed algorithms for graph mining. He has published research papers in top tier conferences and journals.
- Cheng Chen, Kui Wu, Venkatesh Srinivasan, and Xudong Zhang. "A Comprehensive Analysis of Detection of Online Paid Posters." in Recommendation and Search in Social Networks, pp. 101-118. Springer International Publishing, 2015.
- Cheng Chen, Lan Zheng, Venkatesh Srinivasan, Alex Thomo, Kui Wu, Anthony Sukow. "Conflict-Aware Weighted Bipartite B-Matching and Its Application to E-Commerce." in IEEE Transactions on Knowledge & Data Engineering , to appear, 2016.
- Cheng Chen, Kui Wu, Venkatesh Srinivasan, R. Kesav Bharadwaj. "The Best Answers? Think Twice: Identifying Commercial Campaigns in the CQA Forums." in Journal of Computer Science and Technology, Volume 30, Issue 4, pp. 810-828, July 2015.
- C. Chen, S. Chester, V. Srinivasan, K. Wu, and A. Thomo, "Group-Aware Weighted Bipartite b-Matching," The 25th ACM International Conference on Information and Knowledge Management (CIKM 2016), 2016.
- C. Chen, F. Dong, K. Wu, V. Srinivasan, and A. Thomo, "From Recommendation to Profile Inference (Rec2PI): A Value-added Service to Wi-Fi Data Mining," The 25th ACM International Conference on Information and Knowledge Management (CIKM 2016), 2016
- Guoming Tang, Jie Chen, Cheng Chen, and Kui Wu, "Smart Saver: a Consumer-Oriented Web Service for Energy Disaggregation," IEEE International Conference on Data Mining (ICDM 2014), Demo Track, Shenzhen, China, Dec. 2014.
- Cheng Chen, Lan Zheng, Alex Thomo, Kui Wu, and Venkatesh Srinivasan. "Comparing the staples in latent factor models for recommender systems," in Proc. of the 29th Annual ACM Symposium on Applied Computing - Data Mining track, pp. 91-96, March 2014.
- Cheng Chen, Kui Wu, Venkatesh Srinivasan, and R. Bharadwaj. "The best answers? think twice: online detection of commercial campaigns in the CQA forums," in Proc. of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 458-465, August 2013.
- Cheng Chen, Kui Wu, Venkatesh Srinivasan, and Xudong Zhang. "Battling the internet water army: detection of hidden paid posters," in Proc. of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 116-120, August 2013.
Yiming (Nathan) Zhang
Nathan is a Data Scientist at Black Duck Research. He is currently leading the security team that is creating Artificial Intelligence driven solutions for Cyber Security problems. Outside of work, he likes to play and watch tennis, and he is a big fan of Novak Djokovic. He holds an MSc in Electrical and Computer Engineering from the University of British Columbia.
- Zhang, Y., Liu, A., Tan, S. N., McKeown, M. J., & Wang, Z. J. (2016). Connectivity-based parcellation of functional SubROIs in putamen using a sparse spatially regularized regression model. Biomedical Signal Processing and Control, 27, 174-183.
- Zhang, Y., Liu, A., Tan, S. N., McKeown, M. J., & Wang, Z. J. (2015, April). Connectivity-based parcellation of putamen using resting state fMRI data. In Biomedical Imaging (ISBI), 2015 IEEE 12th International Symposium on (pp. 34-37). IEEE.
- Tan, S. N., Zhang, Y., Liu, A., Wang, J., & McKeown, M. J. (2016). Altered functional topography of the striatum in resting state fMRI in Parkinson's disease. Parkinsonism & Related Disorders, 22, e81-e82.
Research Leadership Team:
Baljeet is the Vice President of Research at Black Duck Software and founder of Black Duck Software Canada, an R&D division of the company. Baljeet is also an Adjunct Professor at the Sauder School of Business at the University of British Columbia, and Chief Scientific Adviser of TeejLab Inc., a technology advisory company that specializes in Data Science driven innovations. Before heading Black Duck Canada, he was Research Director at SAP Canada Inc.
He is an award-winning computer scientist who has led several research and industry projects in the areas of open source governance, database management, analytics and networks. He evaluates Open Source software trends and related research problems to plan growth opportunities in various business areas for different industries. He specializes in managing innovation projects from the idea/need identification phase through to completion and go-to-market strategies. He designs and implements strategic plans and builds high-performance teams for research startups by building relationships with academia and industry.
He received his PhD degree in Computing Science from the University of Alberta, Canada, a post-doctorate from the National University of Singapore and a management certificate from the Singapore Management Institute in Singapore. He has published numerous patents and research work at various international journals and conferences. He provides thought leadership and gives lectures at various international venues. He was chosen "Graduate Scholar" by NSERC, Canada during 2006-2009 and "Young Global Scientist" by the Government of Singapore in 2011 and 2012. He was chosen Distinguished Alumni of UNBC in 2017.
- Baljeet Malhotra, et al., Freeware and Open Source Software Data Management and Analytics System, US20160275116.
- Baljeet Malhotra, et al., Software Nomenclature System for Security Vulnerability Management, US20160188882.
- Baljeet Malhotra, Complex Event Processing for Moving Objects, US20140180566.
- Baljeet Malhotra, et al., A System for Policy Management and Analytics, US20150100382.
- Baljeet Malhotra. Chapter: Maritime Data Management and Analytics: A Survey of Solutions Based on Automatic Identification System, Book: Building Sensor Networks From Design to Applications, Published September 5th 2013 by CRC Press.
- Baljeet Malhotra, Daniel Dahlmeier, Naveen Nandan, A Biclustering-Based Classification Framework for Microarray Analysis. PAKDD Workshops 2014: 429-440
- Jianneng Cao, Thomas Kister, Xiang Shili, Baljeet Malhotra, Wee-Juan Tan, Stephane Bressan, Kian-Lee Tan. ASSIST: Access Controlled Ship Identification Streams, Trans on Large Scale Data and Knowledge Centered Systems, 11:1-25, 2013.
- Deepen Doshi, Baljeet Malhotra, Stephane Bressan, Jasmine Siu Lee Lam. Mining Maritime Schedules for Analysing Global Shipping Networks, IJBIDM, 7(3):186-202, 2012.
- Baljeet Malhotra, Mario A. Nascimento, and Ioanis Nikolaidis. Exact Top-K Queries in Wireless Sensor Networks, IEEE TKDE, 23(10):1513-1525, 2011.
- Baljeet Malhotra, Ioanis Nikolaidis, and Mario A. Nascimento. Aggregation Convergecast Scheduling in Wireless Sensor Networks, Wireless Networks, 17(2):319-335, 2011.
- Baljeet Malhotra, Ioanis Nikolaidis, Janelle Harms. Distributed Classification of Acoustic Targets in Wireless Audio-Sensor Networks. Computer Networks, 52(13):2582-2593, 2008.
- Baljeet Malhotra, Ioanis Nikolaidis, Janelle Harms. A Simple Vehicle Classification Framework for Wireless Audio-Sensor Networks. Int. J. of Telecommunications and Information Technology on Wireless Ad-Hoc Networks, 1:43-50, 2008.
- Baljeet Malhotra, Alex A. Aravind. Path-adaptive On-Site Tracking in Wireless Sensor Networks. IEICE Trans. on Information and Systems, E89-D(2):536-545, 2006.
- Baljeet Malhotra, John A. Gamon and Stéphane Bressan. SALSA: A Software System for Data Management in Field Spectrometry. Proc. of Scientific and Statistical Database Management Conference (SSDBM), Chania, Greece, pp. 634-639, 2012.
- Baljeet Malhotra, Ioanis Nikolaidis, Mario A. Nascimento and Stéphane Bressan. Biased Shortest Path Trees in Wireless Networks. Proc. of the IEEE Int. Performance Computing and Communications Conference (IPCCC), Orlando, USA, pp. 1-6, 2011.
- Deepen Doshi, Baljeet Malhotra, and Stephane Bressan. Modelling and analysis of shipping networks from online maritime schedules. ACM Proc. of the Int. Conf. on Information Integration and Web-based Applications and Services, Dec. 5-7, pp. 206-213, 2011.
- Baljeet Malhotra, Jianneng Cao, Stephane Bressan, Wee-Juan Tan, Thomas Kister, and Kian-Lee Tan. ASSIST: Access Controlled Ship Identification Streams. Proc. of the ACM SIGSPATIAL Int. Symp. on Advances in Geographic Information Systems (ACM-GIS), Chicago, USA, Nov. 1-4, pp. 485-488, 2011.
- Baljeet Malhotra, Ioanis Nikolaidis, and Mario A. Nascimento. WISH-RIBS: Broadcast Scheduling and Opportunistic Failure Recovery in Wireless Networks. Proc. of the Int. Conf. on Communications Networks and Services Research (CNSR), pp. 108-115, 2010.
- Baljeet Malhotra, Ioanis Nikolaidis, and Mario A. Nascimento. Distributed and Efficient Classifiers for Wireless Audio-Sensor Networks. Proc. of the Int. Conf. on Networked Sensing Systems (INSS’08), Kanazawa, Japan, Jun. 17-19, pp. 203-206, 2008.
- Baljeet Malhotra, Mario A. Nascimento, and Ioanis Nikolaidis. Better Tree - Better Fruits: Using Dominating Set Trees for MAX Queries. Proc. of the Int. Workshop on Data Management for Sensor Networks (DMSN in conjunction with VLDB), Auckland, New Zealand, Aug. 24, pp. 1-7, 2008.
- Baljeet Malhotra, Ioanis Nikolaidis, and Janelle Harms. Acoustic Target Classification in Wireless Audio-Sensor Networks. Proc. of the Int. Workshop on Ad-Hoc Wireless Networks (WAHOC), Wisla, Poland, Oct. 15-17, 2007, pp. 955-964.
- Baljeet Malhotra and Alex A. Aravind. An Upper bound Analysis on Ant-Based On-site Tracking in Wireless Sensor Networks. Proc. of the Int. Conf. on Embedded Systems, Mobile Communication and Computing, (ICEMC2), Bangalore, India, Aug. 4-5, 2006.
- Baljeet Malhotra and Alex A. Aravind. OSTSim: On-Site Tracking Simulator. Proc. of the Western Canadian Conf. on Computing Education (WCCCE), Prince George, BC, Canada, May 5-6, 2005. http://web.unbc.ca/wccce05/html/proceedings.html
- Baljeet Malhotra. A Curriculum Guide for Data Warehousing and Business Intelligence. Proc. of the Western Canadian Conf. on Computing Education (WCCCE), Prince George, BC, Canada, May 5-6, 2005. http://web.unbc.ca/wccce05/html/proceedings.html
- Baljeet Malhotra and Alex A. Aravind. A Master-Sink Based Model for On-site Tracking of Multiple Mobile Targets in Wireless Sensor Networks. Proc. of the Int. Conf. on Intelligent Sensing and Information Processing (ICISIP), Chennai, India, Jan. 4-7, 2005, pp. 104-109.
- Baljeet Malhotra, Alex A. Aravind. Energy Efficient On-site Tracking of Mobile Target in Wireless Sensor Networks. Proc. of the Int. Conf. on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), Melbourne, Australia, Dec. 14-17, pp. 43-48, 2004.
Elisa Bertino (Collaborator)
Elisa is a full Professor at the Department of Computer Science, Purdue University, Director of the Cyber Center (Discovery Park) and Research Director of CERIAS. She also heads the Database & Information Security Group carrying out groundbreaking research on protection from insider threat, security of the internet of things, sensors, embedded systems, drones, digital identity management, data security and privacy on the cloud, privacy of mobile devices and data trustworthiness.
Professor Bertino’s main research interests cover areas in the fields of information security and database systems. Her research combines both theoretical and practical aspects, and also addresses applications in a number of domains, such as medicine and humanities. She is co-editor in chief of GeoInformatica and of IEEE Transactions on Dependable and Secure Computing, and editor of the Synthesis Lectures on Information Security, Privacy, and Trust. She has authored several articles in International Journals and Conference Proceedings, and is co-author of several books.
Hasan Cavusoglu (Collaborator)
Hasan is an Associate Professor at the Sauder School of Business at the University of British Columbia. His main research interest is to evaluate strategic impact of information technology (IT) investments. He studies the relationship between the value of IT and the competitive advantages created by the implementation of IT, product variety and differentiation on the Internet. His research also includes evaluating design and implementation issues in information security management systems.
Mario Nascimento (Collaborator)
Mario is a full Professor at the Department of Computing Science, University of Alberta and serves as Chair of the Department. His main research interests lie in the areas of Spatio-Temporal Data Management and Data Management for Wireless Sensor Networks.
Before joining the University of Alberta, Professor Nascimento was a researcher with the Brazilian Agency for Agricultural Research and also an adjunct faculty member with the Institute of Computing of the University of Campinas. In 2007 he was recognized as a Senior Member of the ACM. Professor Nascimento has also served as ACM SIGMOD's Information Director (2002-2005) and ACM SIGMOD Record's Editor-In-Chief (2005-2007). He is currently a member of the VLDB Journal's Editorial Board and of the SSTD Endowment's Board of Directors.
Kui Wu (Collaborator)
Kui is a full Professor at the Department of Computing Science, University of Victoria. Professor Wu’s expertise covers performance modeling as well as the evaluation of networking systems, cloud computing, Quality of Service (QoS) of computer networks and online social networks. He has made significant contributions to network performance modeling with stochastic network calculus, network planning, and information processing and modeling in online social networks.
Professor Wu’s research has been published in several top journals and conferences, including IEEE Transactions on Computers and IEEE Transactions on Parallel and Distributed Systems. His work includes consulting for Streetlight Intelligence Inc. (STI), Canada, where he helped design a wireless sensor network for intelligent streetlight control. He has also worked with Nokia, Canada, to develop new technologies for fast, privacy-preserving information exchange over mobile social networks, and with InteLuma Inc. on cloud-based data management for energy consumption data. His R&D projects with Schneider Electric on power quality analysis of enterprise-level power networks led to two approved US patents.
Gail is a full professor at the University of British Columbia’s Department of Computer Science and Vice President Research & Innovation. Dr. Murphy’s research focuses on improving the productivity of software developers and knowledge workers by providing the necessary tools to identify, manage and coordinate the information that matters most for their work.
Dr. Murphy joined UBC in 1996 and was a key driver of the new Master of Data Science—a professional graduate program launching in Fall 2016—and has been instrumental in championing the creation of a Data Science Institute at the university. She also maintains an active research group with post-doctoral and graduate students. She is a Fellow of the Royal Society of Canada and an Association for Computing Machinery (ACM) Distinguished Scientist, as well as co-founder and Chief Scientist at Tasktop Technologies Incorporated. Dr. Murphy also serves on the editorial boards for Communications of the ACM, and Institute of Electrical and Electronics Engineers Transactions on Software Engineering.
Black Duck Software
611 - 4538 Kingsway, Burnaby
BC, V5H 4T9, Canada