dc.contributor.author: Williamson, David W [en_NZ]
dc.date.available: 2011-04-07T03:12:38Z
dc.date.copyright: 2006-11-22 [en_NZ]
dc.identifier.citation: Williamson, D. W. (2006, November 22). A lightweight data integration architecture (Dissertation, Master of Science). Retrieved from http://hdl.handle.net/10523/1288 [en]
dc.identifier.uri: http://hdl.handle.net/10523/1288
dc.description.abstract: Content syndication specifications such as Atom have become a popular mechanism to disseminate information across the Internet, with many sites providing Atom feeds for users to subscribe to and consume. Such a scenario typifies the originally intended use of Atom; however, our research has explored an alternative domain for this syndication technology. This research has evaluated Atom for its potential as a lightweight platform to support data integration from a set of data sources to a single target database. The implementation of the Atom-based architecture that we developed for this research combines freely available server-side scripting technology with the simplified asynchronous connection scheme that content syndication technology offers. We use several use cases each with different degrees of complexity, yet sharing common requirements, as a guide in the development of our prototype. In order to evaluate our Atom-based architecture, our experimental design required the construction of an evaluation framework that measured the prototype's impact upon the network and computation resources it consumed. These measurements were compared with observations of response time requirements between operational and analytical processing systems. The experiments carried out to evaluate the Atom-based data integration architecture have shown that the architecture has potential in facilitating a lightweight data integration solution. Our research has shown that an Atom-based architecture is capable of operating within a range of conditions and environments, and with further development, would be capable of greater processing efficiency and wider compatibility with other types of data structures. [en_NZ]
dc.format.mimetype: application/pdf
dc.subject.lcsh: QA76 Computer software [en_NZ]
dc.title: A lightweight data integration architecture [en_NZ]
dc.type: Dissertation [en_NZ]
dc.description.version: Unpublished [en_NZ]
otago.bitstream.pages: 101 [en_NZ]
otago.date.accession: 2006-11-23 [en_NZ]
otago.school: Information Science [en_NZ]
thesis.degree.discipline: Information Science [en_NZ]
thesis.degree.name: Master of Science
thesis.degree.grantor: University of Otago [en_NZ]
thesis.degree.level: Masters Dissertations [en_NZ]
otago.openaccess: Open
dc.identifier.eprints: 484 [en_NZ]
otago.school.eprints: Database Research Centre [en_NZ]
otago.school.eprints: Information Science [en_NZ]