Tuesday, August 20, 2019
Data Warehouse Characteristics And Definition Information Technology Essay
A data warehouse provides an integrated view of the customer and their relationship with the organisation by bringing together data from a number of operational systems. It gives a complete picture of the enterprise by looking beyond the traditional view of information and focusing on enterprise-wide subjects such as profits, sales and customers. These subjects span both organisational and process boundaries, so they require information from various sources.

Data warehouses are made up of large databases that store the integrated data of the enterprise. This data may be obtained from both internal and external sources. Internal sources are the operational systems of the enterprise. External sources include government bodies, third-party organisations, business partners and customers. These databases also store the metadata that describes the content of the data stored in the warehouse.

Data warehouses are designed and constructed in a denormalised manner so that they replicate the user's dimensional view of the business. A denormalised structure makes it easier to analyse, examine and summarise the data over different periods of time and at different levels of detail.

Data warehouses have a time dimension: all the data is time-stamped. This allows the data to support reports that compare current figures with those from earlier months or years, and it helps the organisation's decision makers to understand trends and patterns in the market and in customer behaviour over time.

Data warehouses contain both atomic and summarised data. Atomic data is the data held at the greatest level of detail.
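As a rough illustration of the dimensional, time-stamped design and of the difference between atomic and summarised data, the following Python sketch uses the standard sqlite3 module. The table names (dim_date, fact_sales), the columns and the sample rows are invented for illustration only and do not come from any particular warehouse.

```python
import sqlite3

# Hypothetical schema: one denormalised fact table plus a time dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- e.g. 20190115
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),  -- the time stamp
    product  TEXT,      -- denormalised: product name stored directly
    customer TEXT,
    amount   REAL       -- atomic grain: one row per individual sale
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20190115, 2019, 1),
    (20190120, 2019, 1),
    (20190203, 2019, 2),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (20190115, "widget", "alice", 10.0),
    (20190120, "widget", "bob", 15.0),
    (20190203, "gadget", "alice", 40.0),
])


def monthly_summary(connection):
    """Roll the atomic sales up to one summarised row per (year, month)."""
    return connection.execute("""
        SELECT d.year, d.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.year, d.month
        ORDER BY d.year, d.month
    """).fetchall()


def sales_for_customer(connection, customer):
    """Answer a detail-level question that the monthly summary cannot."""
    return connection.execute(
        "SELECT date_key, product, amount FROM fact_sales "
        "WHERE customer = ? ORDER BY date_key",
        (customer,),
    ).fetchall()


summary = monthly_summary(conn)
detail = sales_for_customer(conn, "alice")
```

Here the summary can always be recomputed from the atomic rows, but the per-customer detail cannot be recovered from the monthly totals alone.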
Keeping atomic data lets the warehouse respond directly to queries that need the highest level of detail. As the name suggests, summarised data provides a quick summary and does not go into much detail, so storing only summarised data is not an option. However, storing atomic data requires much more space.

2.2 Purpose

Previously, data was not easily accessible because it was stored in environments that were unfriendly and hard to access. Data warehouses solve this problem by providing access to the integrated organisational data that was held in such environments. They provide security either through their front-end applications or through the database servers, so users can have a secure connection to the warehouse from their personal computers. Because data warehouses provide integrated data, the need for users to understand and access operational data is greatly reduced.

The information provided by a data warehouse is consistent and of high quality. Because the warehouse is the common source of information for the organisation, the data is consistent and the organisation's decision-making process becomes much easier.

Data warehouses are also used to store historical data. Historical data is not kept on the operational systems; instead it is loaded into the warehouse and integrated with the other data there so that it can be accessed quickly.

Data warehouses let their users view the data at different levels of detail and move through it as they require. This freedom to view the data from different angles improves the analysis process by reducing the time and effort required to collect, format and present information.

To make the organisation's information technology infrastructure stronger, data warehouses separate analytical processes from operational processes. They provide an additional system architecture dedicated to supporting decisions.
Because data warehouses focus on meeting the requirements of business decisions, they are the systems best suited to redesigned decision-making business processes.

2.3 Trends in data warehousing

Data warehousing is no longer just a concept, nor is it used for educational purposes only. It has become mainstream. Almost 90% of multinational corporations either use data warehousing or are planning to implement it. Data warehousing has transformed the way business analysis and decision making take place, and the organisations that already use it have witnessed the enormous benefits it has to offer. Web technologies have added to these benefits and have paved the way for easy delivery of critical information.

There have been many changing trends in the field of data warehousing since its beginnings. Technology has long been seen as the driving force behind data warehousing. Now the software in use is also progressing rapidly, and in the years to come we can expect data warehousing to take a major leap, not only in software generally but also in query optimisation, the indexing of large tables, data compression and expanded dimensional modelling.

Real-time warehousing

Real-time data warehousing is increasingly becoming a focus for top executives. Compared with conventional data warehouses, real-time data warehouses are dynamic and provide the most recent view of the business, whereas a conventional data warehouse is more passive and provides historical trends. Business intelligence tools, together with the data warehouse, have mainly been used to make strategic decisions, but they are now needed more for tactical, day-to-day decisions. Companies are under considerable pressure to deliver real-time information to everyone connected to important business processes.
Providing real-time information has increased the productivity of companies tremendously. However, there are a number of challenges that a company has to face while trying to do so.

Data types

Previously, companies included mostly structured, numeric data in their data warehouses. This divided decision support systems into two parts: one that worked with structured data, and another that dealt with knowledge management involving unstructured data. Most structured data is numeric, while much unstructured data takes the form of images and text. Consider a decision maker who is analysing the top-selling products and would like to look at images of those products before deciding further. With structured data alone, this would not have been possible. Organisations have realised this and therefore feel the need to integrate both structured and unstructured data in their data warehouses.

To include unstructured data in the warehouse, vendors are treating multimedia such as images and text as just another data type. Such objects are stored as binary large objects (BLOBs) and are treated as part of the relational data. They are defined as user-defined types and are manipulated through user-defined functions. However, not every binary large object can simply be treated as a relational data type. Video clips, for example, need a server that can deliver multiple video streams at a given rate with synchronised audio.

Once unstructured data has been included in the data warehouse, there must also be a way to search it; without one, integrating unstructured data is of little use. Vendors have now started providing search engines so that users can find all the information they require.
One example of such a search mechanism is query by image, which lets the user search pre-indexed images on the basis of their shape, size and colour. For text data, the search engine retrieves documents based on words, characters, phrases and so on. Search mechanisms for audio and video data are still at the research stage.

Another data type is spatial data. Including spatial data in the data warehouse adds a great deal of value. Spatial data helps to answer questions such as the average income of the people living near a store, or the average driving distance for the people coming to the store. Examples of spatial data include address, city, county and state. Database vendors realise the importance of this type of data, and some of them add special SQL extensions to their products in order to support it.

Data visualisation

Data visualisation is necessary to improve the user's performance in analysis. Users expect to see query results in the form of charts or graphics. If query results are available only as spreadsheets, analysis becomes slower and harder, and the data warehouse is likely to be regarded as outdated.

Over the last few years there have been many trends in the way data visualisation software works. The variety of charts available for different types of data has increased; for example, pie charts are available for viewing numerical results. Dynamic charts allow users to see the results, manipulate them and check the new views online. Newer versions of data visualisation software make it possible to view a large number of results at once and to display complex data structures. Some of the more advanced visualisation techniques available today are chart manipulation, drill-down and advanced iteration.
Companies have also started adopting scorecards and dashboards as a means of viewing performance. Different types of users have different needs: business users require bar charts, scientific users require constellation graphs, analysts require three-dimensional views, and so on. The latest trends in software have made it possible to fulfil the ever-changing needs of the current users of data warehousing systems.

Parallel processing

One of the most important aspects of data warehousing is delivering top-quality performance. The users of a data warehouse constantly run large, complex queries that read enormous amounts of data to produce their results. To analyse these results, individual users then execute a large number of further queries one after the other. Other functions involved are loading data and creating indexes. Both can be slow because of the huge amounts of data and the large number of indexes. For a data warehouse to perform well, processes such as query processing, data loading and indexing must be speeded up. An efficient way to do this is parallel processing, which is achieved by using hardware options and software techniques together.

On the hardware side, the options may include multiple CPUs, multiple server nodes, memory modules and high-speed links between interconnected nodes. For the software implementation of parallel processing, the hardware configuration needs to be chosen properly; if it is not, the operating system and the database will be unable to use the hardware's parallel features. Parallel server and parallel query are the two options that database vendors generally provide for parallel processing. The parallel server option makes it possible to have a separate database instance for each hardware node.
The database instances are also allowed to access a common set of database files. The parallel query option, on the other hand, supports the parallel execution of important functions such as query processing, data loading and index creation. With current technology, running a data warehouse without parallel processing is not an option worth considering.

Tools for query processing

The tools required for query processing are the most important set of tools in data warehousing; a data warehouse cannot succeed without them. Because of this, vendors have been bringing out new and improved query tools over the past few years. Some of the most important areas in which vendors have significantly changed their query tools are flexible representation, aggregate awareness, crossing subject areas, multiple heterogeneous sources and overcoming SQL limitations.

Browser tools

Here the term browser is not restricted to web browsers alone. One of the major advantages of data warehousing is that users can run queries against the data warehouse and generate reports without any assistance from I.T. staff. Browser tools come in handy when users want to go through the metadata and search for specific pieces of information, which then allows them to go directly to the data in the warehouse. Browser tools are also needed while a data warehouse is being developed, when the I.T. team has to go through all the data structures, data sources and business rules. Some of the major improvements that browser tools have undergone in the past few years are: extensible tools that allow any type of data or information object to be defined, open APIs, navigation through hierarchical groupings, and web browsing and search techniques for going through information catalogues.
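The idea of browsing metadata to locate information can be sketched in a few lines of Python. This is only a minimal illustration: it uses SQLite's built-in catalogue (the sqlite_master table and PRAGMA table_info) as a stand-in for a warehouse metadata repository, and the table names, column names and the helper functions browse_catalogue and search_catalogue are hypothetical.

```python
import sqlite3

# Hypothetical warehouse tables, created only so there is metadata to browse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (date_key INTEGER, product TEXT, amount REAL);
CREATE TABLE dim_customer (customer_id INTEGER, customer_name TEXT, city TEXT);
""")


def browse_catalogue(connection):
    """Return {table_name: [column names]} read from SQLite's own metadata."""
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    catalogue = {}
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        # The table name comes from the catalogue itself, not from user input.
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        catalogue[table] = [column[1] for column in columns]
    return catalogue


def search_catalogue(catalogue, keyword):
    """Find (table, column) pairs whose column name contains the keyword."""
    keyword = keyword.lower()
    return [
        (table, column)
        for table, columns in catalogue.items()
        for column in columns
        if keyword in column.lower()
    ]


catalogue = browse_catalogue(conn)
hits = search_catalogue(catalogue, "customer")
```

A real browser tool would also expose business descriptions, data sources and business rules; the sketch covers only table and column names.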
Data Fusion

To provide an integrated view of the enterprise, the data warehouse stores data collected from a number of sources. The data may be taken from different operational systems running on different platforms, each using a different DBMS, and also from a number of external sources. Data fusion is the technology that fuses all these different types of data from multiple sources and stores the result in the data warehouse. It offers a wider scope and includes the real-time integration of data from monitoring systems. A vast amount of research is being carried out to improve this technology, as it has a direct application in data warehousing. Apart from integrating data from multiple sources, data fusion is also expected to address the problem of finding the right information at the right time, which can be difficult because of the vast amounts of data stored. Data fusion is, for now, still in its research phase, and vendors are therefore not hurrying to develop tools for it.

Integrating ERP and Data Warehouses

Enterprise resource planning (ERP) was introduced to the market in the 1990s. The goal of ERP was to support both decision making and the taking of the necessary actions from one integrated environment. It was also supposed to provide companies with integrated corporate data repositories, for which the data was cleansed, transformed and integrated in one place. But the companies that implemented these systems soon realised that relational databases designed and normalised to carry out business operations could not provide the necessary strategic information. In addition, data from external sources and from other operational systems was not included in the ERP data repositories. As a result, companies that were planning to acquire ERP systems started to consider the integration of ERP systems with data warehousing.
There are three major options that allow companies to integrate ERP with data warehousing: the ERP data warehouse, the custom-developed data warehouse, and the hybrid ERP data warehouse enhanced with third-party tools. The ERP data warehouse option lets a company implement data warehousing with the functionality currently available and wait for further enhancements. The drawback of this option is that the enhancements may take a long time to arrive. The second option, the custom-developed data warehouse, gives the company a customised data warehouse and uses third-party tools to get the data from the ERP datasets, although retrieving and loading data from the ERP datasets is not an easy task. The third option, the hybrid ERP data warehouse enhanced with third-party tools, combines the functionality of the existing data warehouse with additional functionality from the third-party tools. Each company needs to select the option that suits it best.

Data Warehousing and CRM

The benefits of having a CRM-ready data warehouse are substantial. Competition among companies is increasing, and there is a need both to retain existing customers and to attract new ones. Companies have started targeting individual customers and fulfilling their needs instead of focusing on a mass market. To achieve this, they have adopted customer relationship management (CRM), which in turn calls for data warehouses that are CRM-ready. Building one is by no means an easy task. The data warehouse needs to hold all the information about every transaction with every individual customer. This means that each unit of each sale of every product to each customer must be recorded in the data warehouse.
Not only the sales data, but also information about every other type of interaction with the customer needs to be recorded. Such detailed recording makes the CRM-ready data warehouse flexible, but it also greatly increases the volume of data. These large amounts of data can be stored across multiple storage management devices and accessed using common data warehouse tools. Functions such as cleansing and transformation also need to be improved, because they become more complex. These are some of the major efforts needed to achieve a CRM-ready data warehouse, and earlier data warehousing tools are not quite capable of meeting the specialised requirements of customer-focused applications.

The Web and Data warehouse

The introduction of the internet has deeply affected the way computing and communication take place. From its start in 1969 with only four host computers, it has come a long way, growing to almost 95 million hosts by 2000, and it continues to grow at an exponential rate. In the year 2000 there were almost 26 million websites and 150 million users making use of web technologies for one reason or another. Companies have set up intranets (private internal networks) and extranets (private networks opened to selected external parties) in order to communicate properly with their employees, customers and business partners. The web has become a universal information delivery system, and today virtually no business can survive without making use of web technologies. E-commerce has become a main focus of business, with an annual investment of 300 billion dollars that is soon expected to cross the 1 trillion mark.
It has therefore become extremely important for companies to make their data warehouses web-enabled in order to exploit the tremendous potential that web technologies have to offer. In doing so, companies need both to bring the data warehouse to the web and to bring the web to the data warehouse.

Bringing the warehouse to the web:

In the early days of data warehousing, data warehouses were developed only for top-level management and a few others, such as managers and analysts, to help them with critical analysis and decision making. The necessary information was delivered to this user group through a client/server environment. But today the needs of businesses have increased tremendously. Warehousing technology has been made available to all the members of the corporation's value chain and is no longer confined to a select group of people. Important information is provided not only to employees but also to customers, business partners and suppliers. In today's highly competitive times, these changes are necessary to increase the productivity of everyone involved, and they are possible only with the help of the internet and web technology.

Web technology, as the new information delivery mechanism, has drastically changed the way the users of the data warehouse retrieve, analyse and share information. Information delivery is somewhat different and has new components: the internet interface provides a browser, a search engine, a home page, hypertext links, downloadable Java and so on. The important requirements of users when the data warehouse is brought to the web include strict security, self-service data access, unified metadata and high performance.
Bringing the web to the warehouse:

To bring the web to the warehouse, the company needs to collect the clicks that every visitor makes on the company website (the clickstream) and then perform the traditional data warehousing functions on them. This must be accomplished in real time and involves the extraction, transformation and loading of the click data into the data warehouse. Dimensional schemas are then developed from this data and the information delivery systems are launched. The click data helps in analysing exactly how visitors moved through the company website. Important information can also be recorded, such as what made visitors purchase the company's product, how they were attracted, and what made them come back to the website. The webhouse, as it is known, has become an extremely important tool for identifying, prioritising and retaining e-commerce customers.

The combination of data warehousing and web technology has become very important to all businesses in the 21st century. Using web technologies for information delivery, and integrating the click data from company websites for analysis, has become the need of the day.
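The extraction, transformation and loading of click data described above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions: the log format ("timestamp visitor_id url"), the sample clicks, the fact_click table and the function names are all hypothetical, and a production webhouse would work on real web server logs in near real time.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw click log: "timestamp visitor_id url" on each line.
RAW_CLICKS = """\
2019-08-20T10:00:00 v1 /home
2019-08-20T10:00:30 v1 /products/widget
2019-08-20T10:01:10 v1 /checkout
2019-08-20T10:05:00 v2 /home
"""


def extract(raw):
    """Extract: split the raw log into (timestamp, visitor, url) tuples."""
    return [tuple(line.split()) for line in raw.splitlines() if line.strip()]


def transform(records):
    """Transform: parse each timestamp and derive date and hour attributes."""
    rows = []
    for stamp, visitor, url in records:
        ts = datetime.fromisoformat(stamp)
        rows.append((ts.strftime("%Y-%m-%d"), ts.hour, visitor, url))
    return rows


def load(connection, rows):
    """Load: append the transformed clicks to a click fact table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS fact_click "
        "(click_date TEXT, click_hour INTEGER, visitor TEXT, url TEXT)"
    )
    connection.executemany("INSERT INTO fact_click VALUES (?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CLICKS)))

# Two simple analyses: who reached the checkout page, and pages viewed per visitor.
buyers = [row[0] for row in conn.execute(
    "SELECT DISTINCT visitor FROM fact_click WHERE url = '/checkout'"
)]
pages_per_visitor = conn.execute(
    "SELECT visitor, COUNT(*) FROM fact_click GROUP BY visitor ORDER BY visitor"
).fetchall()
```

Keeping each click as its own row preserves the path a visitor took through the site, which is what allows questions about purchases and return visits to be asked later.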