ASU/CSE Awards

Top 5% Faculty Award, Fulton Schools of Engineering, 2014

Best Teacher Award, Fulton Schools of Engineering, 2013

Top 5% Faculty Award, Fulton Schools of Engineering, 2012

Distinguished Service in Computer Science and Engineering Award, 2009

Researcher of the Year Award, 2008

Service Faculty of the Year Award, 2007

tDAR (NSF+Mellon Foundation)

On the order of 50,000 archaeological field projects are undertaken annually in the US alone, mostly in anticipation of disturbance or destruction due to development. In order to take advantage of this explosion of data, archaeologists have come to realize that tools that can assist in integrating data are desperately needed. Because the large-scale archaeological data needed to address the most compelling research questions are almost never collected by a single research team, the incapacity to integrate data across projects cripples scientists' efforts to recognize phenomena operating on large spatio-temporal scales and to conduct crucial comparative studies.

Data and knowledge integration are costly processes, both in terms of the time and expertise they require from domain experts and in terms of the computational complexities that arise when the sources are not trivially compatible. Consequently, most existing solutions rely on a one-size-fits-all approach, where the data are integrated once (i.e., the cost of the integration is paid upfront) and the integrated data or knowledge bases are then used as is until a new data source becomes available. However, such snapshot-based integration solutions cannot be applied effectively when the data sources are autonomous and dynamic; instead, data should be integrated on an as-available and as-needed basis. Furthermore, users of the integrated data (e.g., scientists, decision makers) often have a high degree of domain knowledge and strong beliefs about the kinds of integration operations that would be acceptable. They are thus indispensable to the integration process: their needs, assumptions, and knowledge must be fully leveraged. In particular, we observe that overly eager, early conflict resolution (where some alternative interpretations are deemed inapplicable without sufficient evidence) may be detrimental to the effective exploration and use of the available knowledge.

The goal of our work is to tackle the computational challenges underlying a user-driven integration (UDI) system, keeping in mind the human constraints and challenges that underlie the technical considerations. Combining insights from data management, information retrieval, and information integration, UDI will avoid the common computational pitfalls faced by existing systems. The work will lead to transformative user-driven information integration techniques applicable to diverse application domains in which: (a) data collection is inherently inconsistent, context dependent, and subject to imprecision; (b) schemas and ontologies are overlapping and evolving, and their interpretations are query- and user-dependent; and (c) many inferential steps separate interpretations from observational data. While the technical outcomes will include source description, data and query translation across schemas, and query processing across uncertain, heterogeneous sources, the key technical and intellectual impacts will be in algorithms and data structures that help bridge the semantic gap between the expert user and the system through a user-driven integration process based on individual user feedback. The algorithms we will develop tackle both the semantic gap and the computational complexity of the problem. UDI will thus be able to efficiently and effectively resolve mismatches between data, data sources, metadata, and users' assumptions and hypotheses.
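To make the idea of deferred, feedback-driven conflict resolution more concrete, the following minimal Python sketch illustrates the general pattern described above. It is not the actual UDI or tDAR implementation: the attribute names, candidate mappings, scores, and the feedback weight are all hypothetical, chosen only to show how several candidate schema mappings can be kept alive and re-ranked at query time using individual user feedback rather than being resolved once upfront.

    # Illustrative sketch only; not the UDI/tDAR code.
    from collections import defaultdict

    # Hypothetical candidate mappings from a source attribute to target
    # concepts, each with an a priori confidence; none is discarded upfront.
    candidate_mappings = {
        "site_period": [("chronological_phase", 0.7), ("stratigraphic_unit", 0.3)],
        "ware":        [("ceramic_type", 0.6), ("vessel_form", 0.4)],
    }

    # Per-user feedback: +1 when the user accepted a mapping for a past
    # query, -1 when the user rejected it.
    user_feedback = defaultdict(int)
    user_feedback[("ware", "ceramic_type")] += 1  # example accepted mapping

    def rank_mappings(attribute):
        """Rank the surviving interpretations for one attribute, blending the
        a priori score with accumulated user feedback (weight is arbitrary)."""
        ranked = []
        for target, score in candidate_mappings.get(attribute, []):
            adjusted = score + 0.2 * user_feedback[(attribute, target)]
            ranked.append((target, adjusted))
        return sorted(ranked, key=lambda pair: pair[1], reverse=True)

    # At query time the top-ranked mapping is used, but the alternatives
    # remain available in case later feedback changes the ranking.
    print(rank_mappings("ware"))

The point of the sketch is that conflict resolution happens per query and per user, so alternative interpretations are never permanently discarded without sufficient evidence.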

Through an NSF grant, “Archaeological Data Integration for the Study of Long-Term User and Social Dynamics” (NSF HSD/BCS 0624341, 11/1/06–10/31/09), we have already developed tDAR (the Digital Archaeological Record), a cyberinfrastructure to support archaeological research. tDAR seeks to ensure the long-term research utility of the corpus of irreplaceable archaeological knowledge that has been generated over more than a century. In addition, as part of our dissemination and sustainability efforts for tDAR, we obtained a grant (“Digital Antiquity: Enabling and Enhancing Preservation and Access to Archaeological Information”, 2009-2010) from the Andrew W. Mellon Foundation to develop a digital repository for archaeology. The repository builds on the tDAR cyberinfrastructure as the core of a Fedora-based repository and will be managed and maintained at a new center at ASU.

While our past research addressed many of these challenges through a novel “query-driven ad-hoc integration” paradigm that leverages context-driven integration (as opposed to pre-query-processing integration), there remain important challenges in the effective use of the many data sources available through tDAR. Therefore, building on our prior NSF-funded tDAR work and in parallel to the ongoing “Digital Antiquity” cyberinfrastructure development effort funded by the Mellon Foundation, in this proposal we focus on the needs of the users/consumers and investigate computational mechanisms that will help them leverage their assumptions and a priori knowledge to achieve an integration that is most suitable for their individual purposes.

tDAR Project Web Site: http://tdar.org

Related grants:
NSF-III#1016921. "One Size Does Not Fit All: Empowering the User with User-Driven Integration." (2010-2012)
Andrew W. Mellon Foundation. “Digital Antiquity: Enabling and Enhancing Preservation and Access to Archaeological Information.” (2009-2010)

NSF project site: http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1016921

Related past grants:
“Archaeological Data Integration for the Study of Long-Term User and Social Dynamics” (NSF HSD/BCS 0624341, 11/1/06–10/31/09)

Short bio


K. Selcuk Candan is a Professor of Computer Science and Engineering at the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University, where he leads the EmitLab research group. He joined the department in August 1997, after receiving his Ph.D. from the Computer Science Department at the University of Maryland at College Park.


Prof. Candan's primary research interest is in the management of non-traditional, heterogeneous, and imprecise (such as multimedia, web, and scientific) data. His research projects in this domain have been funded by diverse sources, including the National Science Foundation, the Department of Defense, the Mellon Foundation, and DES/RSA (Rehabilitation Services Administration). He has published over 140 articles and many book chapters, and has authored 9 patents. Recently, he co-authored the book "Data Management for Multimedia Retrieval" for Cambridge University Press and co-edited "New Frontiers in Information and Software as Services: Service and Application Design Challenges in the Cloud" for Springer.


Prof. Candan has served as an editorial board member of one of the most respected database journals, the Very Large Data Bases (VLDB) Journal. He is currently an associate editor for the IEEE Transactions on Multimedia and the Journal of Multimedia. He has served on the organizing and program committees of various conferences. In 2006, he served as an organizing committee member for SIGMOD'06, the flagship database conference of the ACM and one of the best conferences in the area of data management. In 2008, he served as a PC chair for another leading, flagship ACM conference, this time focusing on multimedia research (MM'08). More recently, he served as a program committee group leader for ACM SIGMOD'10. He has also served on the review board of the Proceedings of the VLDB Endowment (PVLDB). In 2011, he served on the Executive Committee of ACM SIGMM.


In 2010, he was a program co-chair for the ACM CIVR'10 conference. In 2011, he served as a general co-chair for the ACM MM'11 conference. In 2012, he served as a general co-chair for ACM SIGMOD'12. In 2015, he will serve as a general co-chair for the IEEE International Conference on Cloud Engineering (IC2E'15).


He is a member of the Executive Committee of ACM SIGMOD and an ACM Distinguished Scientist.


For his curriculum vitae, please click here.